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GENERAL  INTRODUCTION 


A  recurring  problem  in  many  fields,  especially  diffraction  optics, 
is  the  reconstruction  of  a  Fourier  transform  pair  g,G  from  partial 
data  on  either  or  both  functions.  Considerable  effort  has  been  expended’ 
in  the  development  of  algorithms  for  its  solution;  although  there  have 
been  some  successes,  the  problem  has  generally  proved  to  be  difficult. 

Of  particular  importance  to  the  RADC  effort  is  the  retrieval  of  wavefront 
aberrations  from  the  measured  point  spread  function  of  an  optical  system. 

Discussions  with  several  investigators  who  have  employed  their  own 
algorithms  to  these  problems  have  indicated  a  sort  of  hit-or-miss  attitude 
with  respect  to  their  behavior  in  various  situations.  Sometimes  the 
particular  algorithm  works  and  sometimes  it  fails  when-  the  data  are 
noisy.  With  the  possible  exception  of  Youla*s  recent  study,  there  are 
really  no  serious  attempts  to  understand  the  stability,  rate  of  convergence, 
etc.  with  respect  to  noise  in  the  measurements. 

The  present  contract  effort  was  devoted  to  the  development  and 
mathematical  understanding  of  new  algorithms  based  upon  numerical  functional 
analysis  which  are  robust  with  respect  to  noisy  data.  The  basic  material 
is  contained  in  the  two  sections  entitled: 

I.  MgofLitknA  ^OK  fizcjonAtAuction  pcJUtialZy  known,  bcmdtonLte.d 
TouaIza  tAon&^onm  pcuAA  ^Aom  no^y  dcUa:  I,  thz  pno  to  typical 
tcneoA  pAoblejn. 


II.  AlgoAAMimi  ^oA.  A.zcon6VtucXlon  pcuitiaZty  known,  bandtimltzd 
FouAyieA  tfiam^ofun  paiA6  {Aom  no-L6y  data:  II,  -die  nonZimoA. 
pAobZ&m  0(J  phoAe  -te-dtievod. 

Both  sections  are  very  mathematical  and  employ  mathematics  not  commonly 
encountered  by  optical  physicist  and  engineers.  For  this  reason  a 
summarizing  section  has  been  included;  it  is  the  first  section  in  the 
report  and  is  entitled 

AZgoAiMim  ^oa  Aje.con&tAuctLon  oi  paAtiaZZy  known,  bandZZmcted 
TovlAjL^a  tAonA^OAm  pcUAA  ^Aom  noZ&y  data. 
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ABSTRACT 

This  paper  is  a  summary  of  more  detailed  mathematical  work  by  the 
author  on  recovery  of  partially  known  Fourier  transforms.  These  problems  of 
inversion  of  the  finite  Fourier  transform  and  of  phase  retrieval  are  known  to 
be  ill-posed.  We  draw  a  distinction  in  the  resultant  ill-conditioning  of  the 
problems  between  global  ill-conditioning  (due  to  the  existence  of  multiple 
exact  solutions)  and  local  ill-conditioning  (due  to  the  existence  of  large 
neighborhoods  of  the  true  solution,  all  of  whose  members  are  indistinguishable 
from  the  true  solution  if  the  data  is  noisy) .  We  then  develop  extensions  of 
known  algorithms  that  attempt  to  reduce  at  least  the  effects  of  local  ill- 
conditioning  on  numerical  solutions  by  using  the  idea  of  filtered  singular 
value  decomposition,  and  present  some  numerical  examples  of  the  use  of  those 
algorithms .  '..c  ^ 
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1,  INTRODUCTION 


A  recurring  problem  in  many  fields  (especially  diffraction  optics, 
electron  microscopy,  and  X-ray  diffraction)  is  the  reconstruction  of  a  Fourier 
transform  pair  g,G  from  partial  data  on  either  or  both  functions.  Con¬ 
siderable  effort  has  been  expended  in  the  development  of  algorithms  for  its 
solution;  although  there  have  been  some  successes,  the  problem  has  generally 
proved  to  :e  difficult. 

The  canonical  examples  for  such  reconstructions  are: 

ExaTip£.e  1,  ExXAapotcution  Bojid- Limitzd  S-igmiti.  Given  a  noisy  measure¬ 
ment  g  of  g  on  an  interval  A=  )cnowledge  that  G 

vanishes  outside  the  bounded  interval  BE  [b^,b2],  reconstruct  g  and  G  on 
the  entire  real  line. 

Example  1  is  the  archetypical  linear  problem  in  transform  reconstruction. 

A  number  of  algorithms  have  been  proposed  for  its  solution,  either  by  iterative 
means:  Gerschberg  and  Saxton  [1],  Papoulis  [2],  Youla  [3]  or  by  direct  means: 
Cadzow  [4] ,  Sabri  and  Steenaart  [5] . 

Example  2,  The  Phase  PA.obietn.  Given  a  noisy  measurement  m  of  lg|  on 
an  interval  A  and  the  )cnowledge  that  G  vanishes  outside  the  bounded  inter¬ 
val  B,  reconstruct  g  and  G  on  the  entire  real  line. 

Example  2  has  been  of  theoretical  interest  for  some  time;  see  for  example 
Burge,  Fiddy,  Greenway,  and  Ross  (61;  however  in  this  most  general  form  it  has 
proved  intractable.  Nisnerical  solutions  obtained  in  particular  cases  have 
done  so  by  (sensibly)  incorporating  further  knowledge  of  g  and  G.  Two 


such  examples  are : 

EKOmpie.  2a,  The.  Two  Moduti  P^obiem.  Given  noisy  measurements  m  and  n 
of  |g|  on  A  and  |g|  on  B,  and  the  knowledge  that  G  vanishes  identic¬ 
ally  outside  of  B,  reconstruct  g,G  over  the  entire  real  line. 

Example  2b,  The  Phase  Problem  with  hionnegatyivity  Constraints.  Given  a 
noisy  measurement  m  of  |gi  on  A  and  the  knowledge  that  G  is  nonnegative 
over  B  and  vemishes  identically  outside  B  reconstruct  g,G  over  the  real 
line. 

Gerschberg  and  Saxton  [1]  developed  a  widely  used  algorithm  for  Example 
2a;  almost  all  the  iterative  algorithms  for  the  solution  of  the  general  recon¬ 
struction  problem  are  suitably  modified  versions  of  this  particular  case .  The 
Gerschberg-Saxton  algorithm  (hereafter  denoted  as  the  GS  algorithm)  in  its 
general  form  is  essentially  the  steepest  descent  algorithm  with  unit  step 
length  and  so  is  first  order  [7].  In  [8],  the  GS  algorithm  is  successfully 
applied  to  a  problem  of  the  type  in  Example  2b.  An  alternative  second  order 
algorithm,  based  on  Newton's  method,  has  been  proposed  by  Barakat  and 
Newsam  [9]. 

The  canonical  examples  presented  are  simple  in  form,  nevertheless  they 
contain  the  salient  features  that  make  other,  more  complicated,  problems  in¬ 
tractable.  The  font  of  all  difficulties  is  that  the  reconstruction  problem 
is  ill-posed,  and  badly  ill-posed  at  that.  The  original  definition  of  a  well 
posed  problem  is  due  to  Hadamard  [10] . 


Ve.{ji.yuXion.  a  problem  is  well  posed  if  the  solution 


1.  exists 

2.  is  unique 

3.  depends  continuously  on  the  data. 

If  a  problem  violates  any  of  these  conditions,  it  is  ill-posed.  Each  of  the 
model  problems  violates  at  leas t  one  of  these  criteria.  The  literature,  e.g. 
[11-13]  has  focussed  on  violations  of  uniqueness,  however  it  is  our  contention 
that  violation  of  condition  3  is  the  cause  of  the  most  of  the  problems  en¬ 
countered  in  numerical  solutions;  in  particular  it  accounts  for  the  extreme 
sensitivity  of  such  solutions  to  small  perturbations  in  the  data. 

The  purpose  of  this  paper  is  to  summarize  for  the  optical  community  the 
detailed  analysis  of  transform  reconstructions  developed  in  114-16] .  These 
three  references  contain  a  detailed  mathematical  treatment  of  ill-conditioning 
in  transform  recovery  problems  with  emphasis  on  the- implications  for  numerical 
algorithms.  The  theory  is  cUme»W-ton  independent r  although  the  supporting 
numerical  calculations  are  restricted  to  one-dimensional  problems.  We  hope  to 
present  two-dimensional  calculations  in  the  near  future. 

In  order  to  present  the  results  the  following  notation  will  be  used.  If 
N  2 

Ocs^  then  L  (D)  denotes  the  space  of  square  integrable,  complex  valued 

functions  over  D  with  norm  B*!!  and  inner  product  (•»•)•  If  T  is  any 
2  N 

set  in  L  (]R  )  then  the  projection  onto  T  is  defined  by 


y  * 


y€T  and  lly-xit  =  inf  ilz  -  xll 


(1.1) 


In  the  special  case  when 
has  the  form 


T  S  (D)  ; 


L^(D) 


will  be  abbreviated  to  P 


D' 


G(u)  U)£  D 

(PpG)  (u>)  = 

0  Otherwise 

The  Fourier  transform  (IR*^) -►  (IR^)  is  defined  to  be 


(1.2) 


g  (v) 


(^G)  (u) 


OD 


A  ^ 

2iTiv*oJ 


G(uj) 


dCi  . 


(1.3) 


N 

If  D  and  E  are  bounded  siibsets  of  3R  then  the  operator  will  be 

D  £ 

called  a  finite  Fourier  transform  (fFT) .  Finally  the  interval  [-c,c]  shall 
be  denoted  by  cl;  the  projection  P^^  shall  be  abbreviated  to  P^,  and  the 
fFT  is  represented  by 

The  next  section  of  the  paper  presents  a  survey  of  known  results  on 

example  1,  which  may  be  shown  to  be  equivalent  to  inversion  of  the  operator 
1  1/2 

^  where  J  (bj^-b2)]  .Of  particular  in^ortemce  is  the  rela¬ 

tion  between  the  ill-conditioning  of  problem  1  and  the  singular  value  decom¬ 
position  (SVD)  of  and  how  the  two  ideas  are  combined  in  inversion  by 

filtered  SVO.  In  Section  3  local  ill-conditioning  of  the  nonlinear  phase 
retrieval  problems  is  described  in  terms  of  that  of  the  fFT;  and  it  is  con¬ 
trasted  with  the  global  ill-conditioning  due  to  the  possibility  of  multiple 
exact  solutions.  Section  4  presents  a  simple  generalization  of  the  GS  algo¬ 
rithm,  tailored  to  overcome  some  of  this  ill-conditioning;  together  with  a 
review  of  its  convergence  properties.  In  Section  S  extensions  of  this 
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2. 


THE  LINEAR  PROBLEM 


The  extrapolation  problem  of  example  1  requires  for  its  solution  inversion 
of  the  linear  integral  equation 

Since  A  and  B  are  bounded  sets  is  a  compact  lineeu:  integral 

operator  and  Eq.  (2.1)  is  a  Fredholm  integral  equation  of  the  first  kind. 
Inversion  of  such  an  equation  is  the  prototypical  linear  ill-posed  problem. 

The  ill-posed  natxire  of  the  problem  is  exposed  in  the  construction  of 
the  solution  using  the  singular  value  decomposition  (SVD)  of  P  ^P_.  As 

A  0 

0D 

outlined  in  Baker  117],  a  compact  linear  operator  has  an  SVD 

consisting  of  functions  and  and  nonnegative  real  numbers  with 

the  properties  that 

i.  ^''^i^i«l  complete  sets  of  orthonorroal  functions 

2  2 

Jor  L  (A)  and  L  (B) ,  respectively. 

ii.  a.  >0.  ^md  lima.  *0. 

1  1+1  ^  1 

iii.  P-^P-'i^.  -  o  t  .  (2.2) 

A  P  X  XX 

Expansion  of  G  and  P^g  as  sums  of  singular  functions 

G  -  S  bi'I'i  V  S 

gives  a  formal  solution  to  Eq.  (2.1)  by  equating  coefficients,  i.e., 

■  V  —  "A  ■  *1  • 
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r 


This  solution  illustrates  the  ill-conditioning  in  the  problem.  For  instance 
if  g  is  a  perturbation  of  g  such  that 


P^g  ■  2  with  X  (2.5) 

i 

then  it  is  possible  that  the  error  occurs  in  a  high  frequency  (large  i)  com¬ 
ponent  of  P  g,  so  that  a  perturbation  of  size  is  induced  in  G.  For 

fixed  z  this  perturbation  grows  arbitrarily  large  as  Thus  Eq.  (2.1) 

fails  condition  3  of  Hadamard's  definition  with  respect  to  data  perturbation 
in  g.  A  similar  argument,  but  one  that  is  rarely  made,  shows  that  the 
equation  is  also  ill-posed  with  respect  to  changes  in  the  model,  i.e.  perturba¬ 
tions  of  the  operator 

The  above  arguments  show  that  the  whole  solution  G  cannot  be  recovered 

to  within  any  specified  accuracy  given  uncertainty  in  the  data.  Therefore 

the  natural  question  to  ask  next  is:  "How  large  a  part  of  the  solution  can  be 

recovered  to  within  a  desired  accuracy  in  the  presence  of  noise?"  A  partial 

answer  lies  in  the  idea  of  the  dimznb^on  N(6,e^,e2)  of  the  problem; 

loosely  speaking  this  is  the  maximum  number  of  parameters  in  a  description  of 

the  solution  that  can  be  determined  to  within  an  accuracy  5,  given  errors 

e,  in  P.g  and  e_  in  P.^P*,-  (The  latter  error  includes  discretization 
X  A  2  As 

and  roundoff  errors  as  well  as  those  arising  from  an  imperfect  mathematical 
model  of  the  real  world.) 

A  more  precise  definition  of  the  essential  dimension  as  the  maximum 
dimension  of  any  subspace  U  for  which  the  associated  projection  P^G  of 


the  solution  cem  be  accurately  calculated  116,18]  shows  that  it  may  be 


expressed  in  terns  of  the  singular  values.  The  optimum  subspace  is  the  span 
of  the  first  N(6,C^,C2)  singular  functions.  This  analysis  suggests  that 
Eq.  (2.1)  be  solved  by  filtered  SVD.  This  algorithm  was  introduced  by  Hanson 
[19]  in  which  G  is  expressed  as  the  sum 

i 

where  p  is  a  filter  function  with  the  general  form 

^(a,o,Y)  ~  0  ^  for  large  0 

~  0  for  small  0 

and  Y  is  a  parameter  that  incorporates  knowledge  of  data  errors  and  the 
desired  solution  accuracy. 

Filters  come  in  many  forms.  For  instance,  if  the  error  in  g  is  such 
that  llP^g  -  P^gll  <  the  model  error  is  negligible  and  the  projection 

PyG  is  to  be  determined  to  within  an  accuracy  6  (i.e.  HP^G  -  P^GlI  <  5) ;  then 

the  filter  associated  with  the  essential  dimension  is 


(2.6) 


(2.7) 


?  (a,0,Y)  • 

•  0 


(2.8) 


^  •  if  0  >  Y 
0  <  Y 

where  Y  ■  Ej^6  *.  This  cutoff  filter  is  a  special  case  (q'*’*)  of  the  class 
of  filters 


t-1 


p(a,0,Y) 


(2.9) 


The  utility  of  the  theory  of  the  essential  dimension  lies  in  its  ability 
to  predict  the  size  and  general  form  of  components  of  the  solution  that  may  be 
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accurately  recovered  in  the  presence  of  noise,  before  numerical  calculations 
are  undertaken.  Such  predictions  are  therefore  most  useful  in  deciding  on 
the  size  and  form  of  an  appropriate  discretization  for  use  in  numerical 
solutions.  The  theory  is  easily  applied  to  transform  recovery  problems  due  to 
the  large  body  of  knowledge  about  the  SVD  of  the  finite  Fourier  transform 
(fFT)  developed  by  L<mdau,  Poliak,  Slepian  and  Walom  [20-26].  The  next 
theorem  gives  a  brief  summary  of  those  results  that  are  most  useful  in  the 
present  problem. 


N 

Tlic.o-’icjn  7.  i.  If  D  and  E  are  bounded  subsets  of  IP  with  volumes 
|d|  and  |e|,  and  surface  areas  |3o|  and  |3e|,  then,  for  leurge  c,  the 
number  n(c,a)  of  singular  values  of  greater  than  a  is  given 

'approximately  by 

n(c,a)  ^  [d]  •  |e| -  y1  9d1  1  log (a*^  -  1) log  c  +  o(c^^”^)  (2.10) 


where  cD  is  the  set  {cd:  dCo}  and  y  is  a  constant  independent  of  c,D 
and  E. 

ii.  In  one  dimension  let  be  the  n-th  singular  value  of  Then 

if  b  is  fixed,  c  arbitrary  and  n  is  determined  by 


n 


4c^  +  log  (2c 

TT 


(2.11) 


where  [a]  denotes  the  nearest  integer  to  a,  then 


(1  + 


eb,-l/2 


lim  o 


(2.12) 


iii.  The  singular  functions  of  ^  are  the  eigenfunctions  of  the 

c 

Stunn-Liouville  equation 

((l-t^)(J)')  ’  +  (X -4Tr^c\^)(|)  =  0  (2.13) 

where  solutions  are  required  to  be  uniformly  bounded  over  the  entire  real  line. 

The  theorem  indicates  that  the  singular  values  0  of  have  a 

steplike  distribution;  small  n,  decays  exponentially  for  large 

n,  and  the  change  from  to  e;q)onential  decay  occurs  over  an  interval  of 

2N“  2  2 

width  proportional  to  c  log  c  centered  on  c'^|d1*|e|.  This  in  turn 

implies  that  the  essential  dimension  is  approximately  c  |d|*|e|  and  is 

almost  independent  of  noise:  The  exponential  decay  of  in^lies  that  the 

essential  dimension  NsN(6,e^,e2)  is  determined  by  an  equation  of  the  form 
"N  **1 

e  —  so  that  the  noise  level  must  be  reduced  by  a  multiplicative 

factor  to  give  an  additive  increase  in  N. 

A  second  consequence  is  that  the  low  order  singular  functions  are  solu¬ 
tions  to  a  Sturm-Liouvi lie  problem  and  therefore  are  analytic  and  slowly 
varying.  Thus  they  will  be  well  approximated  by  a  discretization  based  on 
smooth  functions.  This  was  experimentally  verified  in  [14] ,  where  a  number 

of  different  discretizations  of  9  were  calculated  for  varying  values  of  c. 

o 

The  results  indicated  that  discretizations  based  on  Gaussiem  quadrature  or 
Galerkin  approximations  using  Legendre  polynomials  required  only  N  +  0(log  N) 
parameters  to  accxirately  approximate  the  first  N  singular  functions  of 
In  contrast  Galerkin  approximations  based  on  piecewise  constant  or  trigono¬ 
metric  functions  appeared  to  require  at  least  oiN  pairameters  to  achieve  the 


sane  accuracy,  where  a >3.  The  difference  may  be  e}q>lained  by  noting  that 
the  expansion  of  an  emalytic  function  on  [-c,c]  in  terns  of .Legendre  poly¬ 
nomials  will  have  rapidly  decaying  coefficients;  whereas  an  expansion  in  terms 
of  piecewise  continuous  f\inctions  will  converge  only  slowly.  Moreover  although 
trigonometric  functions  are  themselves  smooth,  they  do  not  approximate  the 
original  function  but  rather  a  periodic  extension  of  it  outside  of  [-c,c]. 

This  extension  is  likely  to  have  discontinuities  at  the  endpoints  ±c,  so  that 
its  Fourier  series  is  slowly  convergent.  This  was  observed  in  approximations 
of  the  singular  functions  of  ^  where  they  had  the  worst  performance  of  the 
discretizetions  examined;  therefore  their  use  is  not  recommended  in  transform 
recovery . 

To  conclude  the  section  an  example  of  solution  of  problem  1  by  filtered 
SVD  is  presented,  see  [14]  for  the  details  and  other  examples.  The  equation 

«  (P  q)  (V)  +  eu(v)  (2.14) 

c  c 

was  solved  numerically,  where 

g(v)-2(5i^)^  .  (2.15) 

This  is  the  point  spread  function  of  an  infocus,  aberration  free  slit  aperture. 
The  presence  of  noise  was  simulated  by  the  term  ey(v)  with  u(v)  a  random 
variadsle  uniformly  distributed  over  [-1,1]  and  e  a  control  of  the  magni¬ 
tude  of  the  noise.  The  parameter  values  c*l  and  .03  were  chosen,  a 
Galerkin  approximation  based  on  80  piecewise  constant  functions  was  used  to 
discretize  the  problem,  and  the  resulting  finite  system  was  solved  by  filtered 
SVD  using  the  filter  of  Eg.  (2.9)  with  q*2  and  y*  .Ol* 


Figure  1  shows  two  approximate  solutions  G  calculated  from  noise  data 
along  with  the  true  solution  for  noiseless  data 

G(a))  =  2(1  -  |a)|)  .  (2.16) 

Figure  2  shows  the  extrapolation  4^’G  of  one  particular  perturbation  P  g. 

c 

Because  of  the  symmetry  of  the  test  functions  each  graph  is  for  negative 
values  of  the  argument  only. 

The  graphs  show  that  G  is  a  good  approximation  to  the  true  solution  G, 
except  at  the  origin  where  the  smooth  singular  functions  C2uinot  reconstruct 
the  discontinuity  in  slope.  Since  this  discontinuity  dominates  the  far  field 
behavior  of  .^G,  the  extrapolation  is  not  as  accurate  as  the  reconstruction. 


3.  SOURCES  OF  ILL-CONDITIONING  IN  PHASE  RETRIEVAL 


The  previous  discussion  of  ill-conditioning  in  the  linear  problem  gives 
new  insight  into  why  the  nonlinear  problem  of  phase  retrieval  is  ill-posed. 
Previous  examinations  of  the  problem  have  concentrated  on  showing  that  the 
problem  is  ill-posed  due  to  violations  of  conditions  1  and  2  of  Hadamard's. 

That  it  is  also  ill-posed  due  to  violations  of  condition  3,  and  the  implica¬ 
tions  such  violations  have  for  numerical  solutions,  has  not  been  noted  previous 
to  [15].  We  therefore  present  a  brief  summary  of  all  three  possible  sources  of 
ill-conditioning  and  their  effects  on  the  behavior  of  algorithms  for  numerical 
solution  of  such  problems. 

Violations  of  condition  1  are  not  in  themselves  important  for  the  following 
reason.  Phase  retrieval  is  a  model  of  a  real  world  phenomenon  known  to  exist. 
Therefore  failure  of  the  model  to  have  a  solution  does  not  imply  that  the  real 
physical  quantity  does  not  exist;  so  nonexistence  must  be  due  either  to  an 
inaccurate  model  or  to  noisy  data.  Both  of  these  possibilities  are  simply 
extreme  examples  of  discontinuous  dependence  of  the  solution  on  the  data; 
therefore  violations  of  condition  1  are  subsumed  under  violations  of 
condition  3. 

Nonuniqueness,  however,  is  an  important  source  of  ill-conditioning.  The 
precise  form  of  all  possible  solutions  to  the  one-dimensional  (1-D)  phase 
problem  appears  to  have  been  first  determined  by  Akutowicz  [27,28]  and  has 
been  independently  rediscovered  by  a  number  of  other  authors,  e.g.  [29,30]. 
Their  results  imply  that  the  phase  problem  in  1-D  has  uncountably  many 


solutions.  In  131],  see  also  115],  we  show  that  these  results  may  be  extended 
to  two  (and  higher)  dimensions  to  give  necessary  conditions  on  the  form  of 
multiple  solutions  for  higher  dimensions;  these  conditions  imply  that  multiple 
solutions  are  significantly  less  likely  than  in  1-D. 

The  1-D  results  depend  on  noting  that  any  solution  g(v)  is  an  analytic 
function  of  exponential  growth;  this  follows  from  the  fact  that  g  is  the 
trcuisform  of  a  function  G  with  bounded  support  and  the  Paley-Wiener  theorem. 
Therefore  g(v)  has  a  Hadamard  factorization  132]  of  the  form 

g(v)  =  |g(0)  n  /l  -  —  j  (3.1) 

k=l  '  \  ' 

f 

where  a  and  6  are  real  constants  euid  counteUoly  many 

zeroes  of  g.  Then  2my  other  solution  must  be  of  the  form 

g(v)  =  B(v)g(v)  O,  B  €  3R  (3.2) 

where  B(v)  is  a  finite  or  infinite  product  of  Blaschke  factors,  i.e. 

**  v-v* 

B(v)  *  n  B,  (v)  where  B.  (v)  *  -  .  (3.3) 

1=1  *^1  ^ 

Furthermore  if  6  =  0  then  any  g  given  by  Eq.  (3.2)  is  indeed  a  solution. 

Thus  alternative  exact  solutions  are  basically  generated  by  "flipping"  zeroes 
of  g(v)  to  their  complex  conjugates.  Since  there  are  an  infinite  niunber  of 
zeroes  there  are  also  an  infinite  number  of  exact  solutions  to  a  1-D  phase 
retrieval  problem. 
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If  the  exact  solutions  were  few  and  well  separated,  and  the  phase 

retrieval  problem  well-posed  in  a  neighborhood  of  each  zero,  then  any 

standard  numerical  algorithm  would  perform  satisfactorily.  However,  as  noted 

N 

by  Napier  133] ,  any  N  zeroes  may  be  flipped  or  not  flipped  in  2  different 

N 

combinations,  giving  2  different  solutions.  Therefore  there  is  a  veri' 

large  n\jmber  of  possible  solutions,  in  fact  an  uncoiintable  infinity  of  such 

solutions.  Furthermore,  for  reasonable  functions  G{uj),  any  infinite  product 

B(z)  of  Blaschke  factors  will  converge,  i.e.  B(z)  =  lim  B.,(z)  where  B..(2) 

N  N 

N-K30 

is  a  product  of  N  Blaschke  factors.  Since  each  finite  product  corresponds 
to  a  possible  solution,  it  follows  that  the  set  of  solutions  has  limit  points. 
Any  numC-fuCjcUL  aJtgcfiitkm  iciti  have  QhJtaZ  di^i-icutty  In  the  neighbonhcod  05 
6uch  potnti. 

However  the  situation  improves  markedly  in  N^2  dimensions.  An  exten¬ 
sion  of  the  arguments  for  N  = 1  shows  that  the  zeroes  of  any  alternative 
solution  g  are  the  zeroes,  or  complex  conjugates  of  the  zeroes,  of  g.  But, 
as  g  is  an  entire  function  of  N  complex  variables,  its  zeroes  form  an 
analytic  set  X  of  dimension  N-1  134].  This  set  X  is  essentially  the 

union  of  M  connected,  N-1  dimensional,  analytic  manifolds  X^.  Therefore, 
if  part  of  a  manifold  Xj^  is  flipped  to  form  the  zeroes  of  g,  all  of  X^ 

must  be  flipped  to  ensure  that  the  zeroes  of  g  also  form  an  analytic  set 

P  M-P 

X;  so  X  will  be  of  the  form  U  X.  U  U  X*  .  But  in  two  or  more 

k=l  £*1 

dimensions  the  set  X  is  likely  to  be  irreducible,  i.e.  M=l,  in  the  same 
way  that  almost  all  polynomials  of  two  or  more  variables  are  irreducible. 
Therefore  at  most  two  possible  solutions  with  zeroes  X  and  X*  can  be 


formed,  and  the  solution  is  essentially  unique. 

Therefore,  in  the  simplest  model  problem,  ill-conditioning  due  to  non¬ 
uniqueness  is  likely  to  be  a  severe  problem  in  one-dimensional  problems,  or 
in  problems  that  are  essentially  one-dimensional  (e.g.  those  with  radial 
symmetry'  considered  in  135]);  but  in  higher  dimensions  it  should  have  signi¬ 
ficantly  less  effect.  The  situation  is  less  well  understood  if  side  condi¬ 
tions  are  imposed.  For  instance  if  both  g  and  G  are  analytic  then  the 
solution  of  example  2a  is  essentially  \mique,  but  for  arbitrary  G,  nonunique 
solutions  have  been  constructed  (see  [11]  for  a  review).  Even  less  is  known 
£dDout  the  effect  of  positivity. 

However,  it  should  be  noted  that  for  all  three  problems  there  can  be 
parasitic  solutions  due  to  symmetries,  etc.  For  example,  if  G  vanishes 
outside  of  the  internal  dl  where  d<c  then  translations  G  of  G  will 
still  be  within  cl  and  will  have  transforms  g  which  have  the  same  modulus 
as  g  over  cl.  , 

Finally  we  show  that  example  2  is  locally  ill-conditioned  in  that  it 
violates  condition  3.  The  phase  retrieval  problem  may  be  recast  as  requiring 
the  solution  of  the  nonlinear  equation 

ir(G)  *  m  Se-.  L^(B)-^L^(A)  (3.4) 

where  S’  is  the  composite  operator  ^md 

1  A  B 

^^(g)  =  |g|  L^(A)  -L^(A)  .  (3.5) 

Since  is  a  bounded  continuous  operator  and  compact,  S  is 


compact.  This  implies  that  for  any  e  >  0  an  infinite  sequence  of  functions 

{g.  }  can  be  found,  such  that 
1 

IlG.  -  G.  II  >  1  -  6.  ;  but  IliZ’CG.  )  -  5?(G.  )  II  <  E  .  (3.6) 

131  13 

Nonuniqueness  led  to  global  ill-conditioning,  in  that  there  are  regions 
in  which  many  exact  solutions  exist.  Compactness  leads  to  local  ill- 
conditioning,  in  that  in  the  neighborhood  of  an  exact  solution  there  are 
directions  H  in  which  a  change  in  the  solution  G  induces  a  negligibly 
small  change  in  the  observation  m.  Therefore  although  i?(G)  and  i?(G+H) 
are  distinct  in  theory  in  practice,  with  the  presence  of  measurement  noise, 
they  are  indistinguishable. 

This  local  ill-conditioning  may  be  partially  quantified  by  noting  that 
compactness  is  due  to  the  operator  that  inversion  of  ^  involves 

inversion  of  P,^P„*  Therefore  the  theory  of  the  essential  dimension  and  the 
idea  of  filtered  inversion  outlined  in  the  previous  section  suggest  that  the 
essential  dimension  of  the  phase  problem  (the  number  of  COmptZX  parameters 
that  may  be  accurately  determined)  is  approximately  bounded  above  by 
N=  (a2“a^)  •  (bj^-b2)  ,  is  relatively  independent  of  the  noise  level  in  m,  emd 
that  the  solution  space  should  be  restricted  by  filtering  to  the  span  of  the 
first  N  singular  functions  of  P.^P*,-  Furthermore  the  discretization 
chosen  for  the  problem  should  accurately  approximate  these  singular  values. 

The  local  ill-conditioning  of  examples  2a  and  2b  is  less  well  under¬ 
stood.  However  it  is  reasonable  to  suppose  that  the  total  amount  of  in¬ 
formation  available  in  all  the  constraints  is  less  than  the  maximum  amount  of 


information  in  each  constraint  considered  separately.  Therefore,  if  G  is 
represented  by  2P  real  parameters,  in  example  2a  P  of  these  are  deter¬ 
mined  by  knowledge  of  |g|,  and  up  to  2N  by  knowledge  of  |g|,  giving  an 
upper  bound  on  the  essential  dimension  of  P  + 2N,  In  example  2b  the  upper 
bound  is  P  +  N;  in  this  case  the  condition  that  G  be  real  places  symmetry 
constraints  on  g,  effectively  halving  the  amount  of  information  available 
in  m.  Again  these  results  suggest  that  nijmerical  solutions  be  constructed 
in  a  similar  fashion  as  solutions  to  the  linear  problem. 
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k.  ITERATED  PROJECTION  ALGORITHMS 

The  previous  sections  showed  that  transform  recovery  problems  are  locally 
ill-conditioned,  which  confirms  the  practical  experience  of  a  nianber  of 
authors  [1,8,36]  who  noted  very  slow  convergence  rates  of  the  Gerschberg- 
Saxton  (GS)  algorithm.  Therefore,  in  order  to  modify  the  algorithm  to  cope 
with  this  ill-conditioning,  we  place  GS  in  a  more  general  setting  by  viewing 
it  as  a  special  case  of  finding  a  common  intersection  point  of  a  collection 
of  sets.  In  formal  language,  it  is: 

Given  the  sets  {t.}^_  with  associated  projections  P.  =P 

find  G  such  that  G€  fl  T. 

i=l  ^ 

Gubin,  Polyak  and  Raik  [37]  have  proposed  am  iterative  algorithm  for  the- 
solution  of  such  problems  in  which  at  the  n-th  stage  generated  by 

G  -  *  P.G  where  i  *  (n-l)mod  M  +  1  (4.1) 

n+1  1  n 

and  proved  that  under  certain  conditions,  such  as  the  convexity  of  the  sets 
T^ ,  the  iterates  converged  to  a  common  intersection  point  if  one  existed.  A 
survey  of  this  and  similar  algorithms  appearing  in  the  Russian  literature  is 
given  by  Censor  and  Herman  in  [38].  If  the  set  S  is  defined  to  be 

S  =  {g€L^(3R):  g(v)  =  g(v)  for  v  G  a}  (4.2) 

for  example  1,  or 

S  =  {g£L^(3R):  lg(v)  |  *  m(v)  for  v  €  a}  (4.3) 

for  example  2 ,  and  the  sets  T^  and  T  defined  as 


=  {GCL^dR):  G  »  ^~^g.  g€s}  (4.4) 

T„  =  {GCL^Cro):  G(ijj)  =  0  for  cofs}  (4.5) 

2 

then,  as  the  Fourier  transform  preserves  L  norms 

h  «  Pgg  H  *  ^‘^h  =  Pj^(^"^g)  «  P^G  (4.6) 


and  GS  applied  to  examples  1  and  2  is  recognizable  as  the  iterated  projection 
algorithm  of  Eq.  (4.1)  with  M*2. 

The  advantages  of  the  iterated  projection  algorithm  are:  First  that  it 
allows  easy  incorporation  of  extra  constraints  such  as  those  in  examples  2a 
and  2b  by  setting  M  =  3  and  adding  either  the  set 

T2  =  {G€L^(1R):  G(u)  >0  for  U)£b}  (4.7) 

T2  =  {G€L^(3R):  1g(w)1  =  n  (w)  for  U)  £  B}  .  (4.8) 

Second  that  in  transform  recovery  problems  the  projections  P^  may  be  very 
easily  computed.  The  disadvantage  is  that,  as  stated,  the  algorithm  is 
sensitive  to  local  ill- conditioning.  Figure  3  shows  two  instances  of  the 
effects  of  ill-conditioning;  in  the  first  the  sets  intersect 
at  a  very  acute  angles  so  that  the  projections  are  very  slowly  convergent, 
and  in  the  second  the  presence  of  noise  has  perturbed  the  two  sets  so  that 
an  intersection  point  does  not  exist.  These  possibilities,  and  the 
additional  fact  that  T^  is  not  convex  in  phase  retrieval  problems,  iitply 
that  the  iterated  projection  algorithm  will  either  be  very  slowly  convergent 
or  fail  to  converge  at  all. 


2 


In  order  to  escape  these  difficulties  we  propose  that  the  original 


problem  be  replaced  by 

M  2 

Find  G  to  minimize  F(G)  =  ^  )Ig-P.GI1 

i*l  ^ 

Obviously  F(G)  ^0  and  F(G)  =0  iff  G  is  a  common  intersection  point,  but 
even  if  such  a  point  does  not  exist  due  to  perturbations  of  the  sets  by  noise 
in  the  data, the  G  that  minimizes  F(G)  is  an  acceptable  pseudo-solution  to 
the  problem.  We  also  propose  the  following  extension  of  the  iterative  pro¬ 
jection  algorithm  for  minimization  of  F(G)  in  which  at  the  n-th  iteration 


n+1 


(M- 


,  M-1 

^  £  (P.G  -G  ) 

1-1)  in  n 


G  *  P«(G  +A  H  ) 
n+1  M  n  n  n 


(4.9) 


where 

A„€(0,2)  if  T„  convex 

”  ”  .  (4.10 

€  (0,1)  otherwise 

This  algorithm  will  be  termed  the  pA,OJZCtcon  (RP)  aZgofuMitn  from 

hereon. 

RP  has  a  number  of  useful  features,  in  particular  it  produces  constantly 
decreasing  residuals  as  does  GS  [1] ,  which  are  summarized  in  the  following 
Theorem,  proved  in  [15]. 

ThtO^vejn  4.1.  If  the  projections  P^G  are  unique  and  continuous  in  G 
then  at  the  n-th  iteration  the  iterates  G  of  Eq.  (4.9)  satisfy 
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Vk  >  1  , 


(4.11 


or 


n+k 


Furthermore  if  G  is  a  limit  point  of  the  iterates  then  G  is  a  fixed  point 
of  the  iteration. 

RP  can  also  be  recast  as  one  of  the  standard  optimization  algorithms  as 
the  next  theorem  from  [15]  shows. 


The.OA.zm  4.2.  The  gradient  Vf(G)  of  F(G)  is 

M 


VF(G)  =  2  ^  (G-P.G) 

i=l  ^ 


(4.12 


Therefore  if  ^  then  RP  is  the  standard  steepest  descent 


algorithm  with  variable  steplength  AC  (0,  ^r-r)  . 


We  have  presented  the  algorithm  in  a  form  in  which  iterates  are 


restricted  to  the  particular  set  T„.  This  includes  the  tinrestricted  case 


(T„  =  L  (3R) )  ,  but  also  allows  from  knowledge  about  the  solution  G  to  be 
M 


reimposed  at  each  iteration  after  less  well-known  requirements  are  satisfied 


by  moving  in  the  search  direction.  However,  the  chief  benefit  of  the 


restriction  in  transform  recovery  problems  with  the  set  T„  of  Eq.  (4.5)  is 

n 


that  the  resulting  algorithms  are  e^^-iciznt.  That  is  at  each  iteration  they 
require  function  values  of  G  wd  only  across  the  intervals  A  and 

B.  Efficiency  has  not  always  been  achieved,  for  instance  the  version  of  GS 


proposed  by  Papoulis  [2]  for  inversion  of  the  fFT  requires  knowledge  of  G 


across  the  entire  line  at  each  iteration. 


To  demonstrate  that  transform  recovery  problems  are  efficient  we  first 
note  that  if  T  is  a  linear  sxibspace  then  the  projection  P  is  linear  and 


M  n  n 


«n*5ri  .i,  W 

1*1 


SO  that  the  search  direction  is  independent  for  X,  and  for  any  X  the  new 
iterate  is  still  in  T„.  Now  for  convenience  let  A =  B *  cl  so  that  P„ * P 


P  P,G  =  P  ^ 
cl  c  S 


P^g  is  easily  shown  to  be 


m  (v)  e 


i  arg  g (v) 


V  £  cl 


otherwise 


so  that 


V  ■  ’-V-P/sPi.g 


Moreover  for  example  2a 


(P  P.G) (u) 
c  ^ 


,  ,  i  arg  G(u))  -  _ 

n  (w)  e  ^  ,  u>  €  cl 


,  otherwise 


and  for  example  2b 


(P  P-G) (u) 
C  A 


(Re  G)  (u)  ,  if  (Re  G)  (u)  >  0  and  u  €  cl 


otherwise 


It  is  obvious  that  calculation  of  P  P.G  requires  only  values  of  G  and 

c  1 


g*  ^G  on  cl,  so  that  RP  is  efficient. 


We  conclude  this  section  by  noting  that  the  relationship  between  RP  and 

gradient  method  opens  up  new  possibilities  for  improvement  of  RP.  In 

particular,  as  noted  in  Eq.  (4.13),  if  T^.  is  linear  then  the  search 

n 

direction  is  independent  of  X,  therefore  it  is  possible  to  do  a  line  search 
in  the  direction  of  H^.  By  Eq.  (4.12) 


n 


+  Xh  ) 
n 


VF(G  +Xh  ,H  ) 
n  n  n 


(G  +  XH  -  P  .  (G  +Xh  )  ,H  ) 
n  n  1  n  n  n 


(4.20) 


Since  calculation  of  F(G)  requires  values  of  calculation  of 

F(G^  +  Xh^)  at  any  point  in  a  lin®  search  gives  sufficient  information  for 

calculation  of  the  gradient  at  that  point.  Therefore  a  line  search  algorithm 

using^erivative  information,  such  as  Powell's  cubic  line  search  algorithm 

(39]  ,  may  be  used  for  the  same  cost  as  a  stamdard  quadratic  line  search  that 

2 

uses  values  of  F<G)  only.  Since  in  phase  retrieval  T  (cl),  cubic  line 


searches  may  be  profitably  eitployed. 


5.  ALGORITHMS  BASED  ON  AFFINE  APPROXIMATIONS 


The  previous  section  showed  that  the  G5  algorithm  could  be  extended  to 
an  algorithm  RP  that  mitigated  some  of  the  effects  of  ill-conditioning  in 
transform  recovery.  However  the  extension  does  not  remove  all  of  these 
effects  as,  in  particular,  it  performs  no  filtering  to  restrict  the  solution 
to  a  well-posed  solution  set.  Moreover  as  RP  is  a  variant  of  steepest 
descent,  it  is  only  of  first  order  and  therefore  will  not  perform  well  even 
on  some  well-posed  problems  for  the  same  reasons  that  gradient  algorithms 
perform  poorly  on  some  standard  optimization  problems.  Therefore  we  seek  a 
solution  to  both  these  problems  by  proposing  a  new  class  of  iterative  algo¬ 
rithms  based  on  more  accurate  affine  approximations  to  the  sets  T^  or  the 
functions  F(G)  and  At  each  iteration  these  algorithms  require  the 

solution  of  an  ill-posed  linear  subproblem  similar  to  that  discussed  in 
Section  2;  this  may  be  done  using  filtered  SVD  thus  further  reducing  ill- 
conditioning  in  phase  retrieval. 

We  start  with  a  simple  example  of  how  such  algorithms  may  be  constructed. 
Zn  RP  the  search  direction  may  be  viewed  as  being  obtained  by  replacing 

the  sets  T^  by  point  approximations  the  n-th  iteration,  and  then 

solving  the  subproblem 

M 

Minimize  F  (G)  »  IlG-P.G  .  (5.1) 

n  "  in 

1*1 

In  particular  the  set  S  is  approximated  by  the  point  P_g  .  However  the 

S  n 

only  restrictions  on  functions  in  S  is  that  they  have  value  g  or  modulus 
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m  over  the  interval  A;  outside  of  A  they  may  take  on  arbitrary  values. 
Therefore,  if  A = B  =  cl ,  the  affine  subspace 


s  =  {h;  h  =  P  <g-P  g),g6s} 

f)  w  s  n  c 


(5.2) 


contains  the  point  P_q  and  is  contained  in  the  set  S.  If  it  is  used  to 

S'n 

replace  the  point  approximation  P^^j^  "^In  *  ^  ^^n  n-th 

iteration,  then  the  new  subproblem  to  be  solved  is 


The  minimum  satisfies  the  normal  equations 


c  c  c  S  n  ^  ^ 


(5.4) 


giving  a  search  direction  If  M>2  then  this  subproblem  is 

well-posed  as  (^*  ^  +  (M-1)^  has  a  bounded  inverse;  but  if  M*2  then 
c  c 

the  problem  is  inversion  of  the  fFT  ^  recast  as  a  linear  least  squares 
problem.  This  is  an  ill-posed  problem  best  solved  by  filtered  SVD  as 
described  in  Section  2;  but  as  the  same  linear  operator  appears  at  each 
iteration  the  SVD  need  be  calculated  only  once. 

A  more  accurate  approximation  arises  from  replacing  the  projections 
P^G  by  the  linear  approximation 

P.G~P.G  +  (G  ) {G-G  )  (5.5) 

1  in  in  n 


at  the  n-th  iteration,  where  is  the  Frechet  derivative  of  the  operator 

P..  We  assume  that  ,'9C.  exists  and  is  a  bounded  linear  operator;  conditions 


on  the  sets  that  would  guarantee  these  properties  are  q\iite  complicated 

(e.g.  [40])  and  are  beyond  the  scope  of  this  work.  However  if  possesses 

these  properties  then  it  is  symmetric,  is  the  Hessian  (i.e.  the 

2 

second  Frechet  derivative)  of  the  fxinction  F^  (G)  =  IlG-P^GlI  and  for  every 
G 

(./-jr^(G))  (G -P^G)  =  0  .  (5.6) 

If  this  approximation  is  used  at  the  n-th  iteration  the  corresponding 
subproblem  to  be  solved  is 

M-1 

minimize  F  (G)  =  V  llG  -  P .  G  -  (G  )  (G  -  G  )  II  .  (5.7) 

n  “W  in  in  n 

1=1 

The  function  L  that  solves  this  problem  is  by  definition  the  least  squares 
solution  to  the  block  system 


Since  the  original  transform  recovery  problem  was  ill-posed,  this  block 

system  is  also  usually  ill-posed  and  should  be  inverted  by  filtered  SVD. 

However  the  cost  of  calculation  of  the  SVD  of  the  entire  block  matrix  at 

each  iteration  would  very  expensive.  Therefore  we  propose  that  the  SVD  of 

each  block  (G  )  be  calculated  and  filtered  separately  to  give  an 

1  n 

approximate  filtering  of  the  whole  matrix.  As  we  shall  see  later  in  the 
section  such  filtering  can  be  done  relatively  cheaply. 
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The  third  set  of  algorithms  described  here  are  based  on  the  following 


giving  the  standard  Newton  search  direction  However,  for  phase 

retrieval  problems,  in  order  that  the  resulting  algorithm  be  efficient  in 
the  sense  of  the  previous  section  we  replace  Eq.  (5.12)  by  the  restricted 
equation 

P  V^F(G  )P  L  =  -P  VF(G  )  (5.13 

c  n  c  c  n 

so  that  values  of  iterates  are  required  only  over  the  interval  cl. 


Equations  (5.12)  and  (5.13)  are  normally  ill-posed  and  therefore  best 

solved  by  filtered  SVD.  As  the  fFT,  with  its  exponentially  decaying  singular 

2 

values,  underlies  the  Hessian  V  F(G)  we  propose  a  cutoff  filter  of  the  forr’. 

f(X)  =  if  lXl>£^ 

=  0  otherwise  (5.14) 

where  dependent  on  the  noise  levels  and  desired  accuracy.  This  should 

produce  a  search  direction  in  which  F(G)  should  vary  moderately  rapidly 

as  directions  corresponding  to  small  eigenvalues,  and  thus  slowly  varying  F, 

have  been  filtered  out.  However  the  resulting  direction  is  not  necessarily  a 

2 

descent  direction,  as  for  some  G,  V  F(G)  will  have  negative  eigenvalues; 
but  this  may  be  corrected  by  use  of  the  filter 

f(X)  =  X"^  if  X^e^ 

»  0  otherwise  .  (5.15) 

Newton's  algorithm  in  which  Eq.  (5.13)  is  inverted  with  the  filter  of  Eq. 

(5.14)  shall  be  denoted  by  FN;  if  the  filter  of  Eq.  (5.15)  is  used  it  will 

be  denoted  by  FNP. 

2 

Since  V  F(G  )  changes  with  each  iteration  FN  and  FNP  incur  con- 
n 

siderable  costs  in  calculation  of  a  new  SVD  at  each  iteration.  Although  some 
savings  are  possible  by  using  the  old  SVD  as  a  first  approximation  to  the 
new  SVD  with  optional  iterative  refinement,  we  instead  sought  to  reduce 
Eq.  (5.13)  to  the  block  form  of  Eq.  (5.8)  so  that  block  filtering  could  be 
applied.  This  may  be  done  by  noting  that  if  is  a  symmetric  positive 
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definite  square  root  (./-  J(^)  which  by  Eq,  (5.6)  satisfies 


(G) )  (G -P.G)  =  G-P.G 
111 


(5.16) 


Therefore  Eq.  (5.13) 
M-1 


r  M-i  1  M-i 

ZP  (./- Jl!'  (G„)  )P  L  =  -  52  (G  -P  P.G 

L  1=1  ^  'in  cj  ^  n  c  1  n 


(5.17) 


may  be  rewritten  as 


Li=i  -I  1=1 


(5.18) 


which  in  turn  is  recognizable  as  the  normal  equations  for  the  least  squares 
solution  of  the  block  system 


’  (G^))^^^P 

In  c , 

L  =  - 

G  -P  P,G 
n  c  1  n 

G  -P  P„  ,G 

L  n  c  M-1  n  1 

(5.19) 


Block  filtering  was  chosen  because  of  the  simple  form  of  the  blocks  in 


Eq.  (5.19)  in  phase  retrieval  problems.  If  the  operators  P^  are  viewed  as 


acting  on  the  pair  of  real  functions  !!!*^!]  instead  of  on  the  complex 

Xin  n(u); 


valued  function  H((jo)  then  simple  calculations  show  that  Jf- may 


be  represented  as  a  block  diagonal  operator  whose  2^2  diagonal  blocks , 


are  indexed  by  the  variable  u).  For  example  2a 


f(X)  -  ,  if  \>z^ 

-  0  ,  otherwise  (5.2j) 

to  replace  the  entry  1  -  in  Eq.  (5.20)  by  f(l  -  )  .  Finally 

for  either  problem  the  original  block  (,/- JI'L  (G)  )P  is  reduced  in  size  by 

2  c 
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eliminating  from  the  block  those  equations  for  (Re  H)  (cu)  or 

sin  6  (Re  H)  (to)  -  cos  6  (Im  H)  (to)  that  correspond  to  eigenvalues  that  have  been 

filtered  to  zero. 

From  Eq.  (4. 15) 


P  (./- )P  does  not  possess  an  obvious  symmetric  square  root  however  Eq. 
c  1  c 

1/2 

(5.24)  suggests  the  asymmetric  square  root  (,/- JIfL  (g) )  This  composite 

©  c 

operator  may  be  approximately  filtered  by  filtering  each  component; 

may  be  filtered  as  was  (,/- example  2a  and  JT 
may  be  filtered  after  calculation  of  its  SVD  as  described  in  Section  2. 

Since  the  first  filtering  requires  a  trivial  calculation  at  each  iteration, 
and  ^  is  independent  of  the  iterates,  this  filtering  may  again  be  done 
cheaply. 

Therefore,  after  rows  corresponding  to  filtered  elements  have  been 
eliminated,  in  phase  retrieval  problems  we  are  left  with  a  reduced  block 
system 

P  P^  ^  G 

®  ^  "  (5.26) 

-  G 

2  n  n  J  f 
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to  solve  in  a  least  squares  sense.  This  can  be  done  using  a  standard 
routine  such  as  LLSQF  in  the  IMSL  library ,  or  further  advantaige  may  be 
taken  of  sparsity  within  the  system.  If  Eq.  (5.26)  is  solved  at  each 
iteration  with  the  filter  of  Eg.  (5.22)  resulting  algorithm  is  termed  SQFK; 
if  the  filter  of  Eq.  (5.23)  is  used  the  algorithm  is  SQFNP. 


6.  SOME  ILLUSTRATIVE  NUMERICAL  RESULTS 


To  conclude  the  paper  we  give  a  brief  accoxint  of  some  numerical  solutions 
to  examples  2,  2a  and  2b  iising  the  algorithms  proposed  in  the  previous 
sections.  Reference  is  made  to  our  two  previous  papers  for  detailed  numerical 
computations  of  a  wide  selection  of  problems.  The  measure  of  performance  of 
the  algorithms  was  taken  to  be  the  number  of  iterations  required  to  reduce  the 
function  below  a  prescribed  value.  The  ill-posed  nature  of  phase 

retrieval  problems  implies  that  this  is  not  the  best  of  measures  in  that  a 
small  value  T'(G^)  does  not  necessarily  imply  that  is  close  to  the  true 

solution  G.  However  without  knowledge  of  the  true  solution  we  have  none 
better.  Therefore,  because  of  this  ill-conditioning  and  the  limited  resources 
for  numerical  computation  at  our  disposal,  the  results  presented  here  are 
intended  as  a  guide  to  the  behavior  of  the  algorithms  on  real  problems  rather 
than  a  firm  prediction  of  their  likely  performance. 

We  begin  with  an  account  of  the  mechanics  of  the  computation.  First  is 
the  discretization:  it  was  determined  by  the  requirements  that  it  gives  an 
accurate,  efficient  approximation  to  and  that  the  discrete  projections 

be  easily  calculated.  Therefore  a  discretization  based  on  N  point  Gaussiam 
quadrature  was  chosen;  the  numerical  experiments  mentioned  in  Section  2 
showed  its  efficiency  wd  its  pointwise  nature  allows  easy  evaluation  of 
projections.  The  discrete  problem  thus  involves  determination  of  vectors 
g,  GtC  whose  elements  g^,  G^^  are  to  be  approximations  to  the  function 
values  gCPj^),  at  the  abscissae  Pj^  of  the  N-point  Gaussian  quadrature 
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rule  on  cl.  To  this  end  vectors  ni,  n  E  3R  are  formed  with  coR^onents 
nij^ » Tn(Pj^)  ,  where  m{v)  and  n{u))  are  the  known  moduli .  Then 

matrices  W,  FEC  are  constructed  with  W  being  a  diagonal  matrix  whose 


k-th  entry  is  the  weight  w  of  the  quadrature  rule,  and  F  having  entries 

2TTiPj^Pj^ 

.  The  discretized  example  problems  are  now. 


A  AAA 


Find  vectors  g,  G  such  that  g « FWG  and 


Examp le 

2: 

>^k 

1  * 

Example 

2a; 

l^k 

1  -  iGj  - 

Example 

2b: 

l^k 

1  * 

In 

order 

to  test  the  algorithms,  particular  exairples 

of  each  model 

problem 

were 

chosen.  Of  particular  interest,  in-as-far  as 

this  paper  is 

concerned,  is 

the 

test  function 

G{;.) 

m 

i2TW(-) 

c 

e 

(6 

GJ 

K(-) 

c 

- 

w  S  (-)  +  W  S  (-) 

3  3c  44c 

(6.2) 


G(cj)  is  the  pupil  function  of  a  slit  aperture  having  unit  amplitude  over  the 
exit  pupil  and  suffering  from  wavefront  aberrations  of  optimum  balanced  coma 
and  optimum  balanced  spherical  aberration  S^,  where  W^/X  and  W^/X 
are  the  dimensionless  aberration  strengths.  and  are  the  slit 

aperture  versions  [41]  of  the  Zemike  polynomials.  This  test  function  was 


used  previously  in  [9].  In  reference  [9],  the  inversion  assumed  that  the 
amplitude  distribution  over  the  exit  pupil  was  unity;  in  the  present  case 


neither  the  amplitude  distribution  or  wavefront  are  knovm  a.  pfiLofu..  The 


numerical  calculations  summarised  here  were  carried  out  for  K,  ■W.  b3X/8. 

We  now  outline  the  basic  structure  of  all  the  numerical  test  runs  before 
considering  various  points  in  detail.  All  tests  consisted  of  the  four 
essential  steps: 


1. 

Choose  an 

initial  guess 

^0- 

2. 

Given  iterates  g  and 

’n 

A 

G 

n 

compute  a  search  direction  H^. 

3. 

Calculate 

a  steplength 

A 

n 

and  new  iterates 

n 


n+1 


G  +A  H  , 
n  n  n 


g  ^  *  FWG 

^n+1  n+1 


4.  Iterate  steps  2  and  3  until  the  convergence  critiria  are  satisfied. 
The  convergence  criteria  used  in  all  calculations  were 


F(G  )  <  10 
n 


-5 


or  ^  II A  .H  .11  <2x10 

n-k  n“K 

k»0 


-3 


(6.3) 


together  with  an  upper  limit  N  on  the  number  of  iterations.  The  sum  of 

n&x 

y 

the  last  three  steplengths,  rather  than  DA  H  II  alone,  was  chosen  as  in  ill- 

n  n 

conditioned  minimization  problems  "stop-start”  behavior  is  often  noticed. 

That  is  a  large  step  often  followed  one  or  two  small  steps,  after  which 
another  large  step  is  taken.  This  feature  was  often  observed  in  the  use  of 
affine  approximation  algorithms,  differences  in  magnitude  of  successive 
steplengths  by  factors  greater  than  100  occurred  fairly  frequently. 

Three  different  forms  of  initial  guc^s  were  used: 


1.  Gq-0, 


2.  G 


r\t 


(l-e.) (l+c,r-)G:  , 


Guess  2  is  a  damped  pertuxbation  of  the  true  solution  G  with  repre¬ 

senting  the  noise  level.  For  convenience  we  shall  express  this  level  as  a 
percentage,  e.g.  e^>.2  will  be  described  as  e^>20%  noise.  The  variable 
rj2^  is  a  random  complex  varied^le  with  modulus  uniformly  distributed  over  [0,1] 
and  phase  uniformly  distributed  over  (0,2'n].  Guess  3  represents  a  small 
totally  random  perturbation  about  the  origin;  again  for  convenience  such 
guesses  shall  be  denoted  by  e^>100%. 

The  first  results  reported  are  those  on  determination  of  an  optimal 
choice  of  Three  possibilities  were  considered: 


2 

2.  \  =x  where  r  is  a  random  variable  uniformly  distributed  over 

n  n  n 

[0,2]. 

3.  is  the  approximate  minimm  of  F(G  -fX  H  )  as  a  function  of  X 

n  "  n  n  n 

determined  by  Powell's  cubic  line  search  algorithm  [39]  with  the 

following  convergence  criteria  on  the  iterates  .  X^: 

K  n 


k<5  .  (6.4) 


node!  problem  with  the  appropriate  test  fvnctions  using  BP.  The  parameters 


c^2,  N»40  and  N  >50  were  chosen  and  each  problem  was  started  with 

max 

three  different  initial  guesses  with  C^>20%,  60%  and  100%.  . 

The  results  were  remarkably  uniform  over  all  test  cases  of  model  problem, 

1  2 

test  function  wd  initial  guess.  Choices  and  performed  almost 

identically  with  X^  just  under  a  factor  of  2  better.  Almost  always  only 

one  extra  function  evaluation  was  required  for  X^.  This  extra  comrutation 

n 

almost  exactly  balances  the  savings  in  the  reduced  number  of  iterations  so 
that  all  three  choices  incurred  the  same  computational  cos  t in  reduction  of 
F(G^)  to  a  specified  value.  However  the  greater  flexibility  of  X^  led  to 
its  adoption  in  all  siibsequent  calculations. 

We  next  atten^ted  to  estimate  the  local  ill-conditioning  in  the  problem 


by  adding  a  small  perturbation  of  size  to  the  data  and  starting  the 

''i 

algorithm  at  the  true  solution  G  of  the  unpertturbed  problem.  The  results 

were  inconclusive;  although  the  algorithtns  terminated  at  a  vector  S  such 

that  112^  -  G^II/IIg^II  ^£2'  this  was  due  to  the  conditions  r(2^)  <  lo”^  or 

n  >  N  being  satisfied,  emd  not  due  to  convergence  of  the  iterates  which 
mAX 

appeeured  to  cycle  around  some  fixed  point. 


Further  results  showed  that  the  problem  was  globally  ill-conditioned  in 
that  a  number  of  differing  functions  6^  were  found  starting  from  a  random 
guess  (e^  ■  100%)  such  that  F(8^)'~10  ^  for  example  2,  or  F(2^)~10  ^ 
for  exan^les  2a  and  2b,  but  with  -  G^H/UG^II  ~1.  This  suggests  that  the 


siurface  {(F(G},G):  GCci}  is  very  rugged,  as  expected  from  the  results  in 
Section  3  on  existence  of  multiple  solution  in  one  dimension.  The  results 
also  showed  that  the  iterates  determined  as  approximate  solutions  to  example 


2a  and  2b  were  definitely  more  acceptable  as  solutions  than  those  for  example 
2,  even  though  as  measured  by  F(G)  they  were  worse  by  a  factor  of  10  or 
more. 

A  good  initial  guess  is  of  great  help.  If  one  is  not  available  then  it 
is  tempting  to  start  with  guess  i,  however  some  analysis  [9]  has  shown  that 
if  the  iterates  have  some  symmetries,  then  such  syinnetries  will  be  preserved 
by  the  algorithm  regardless  of  the  form  of  the  true  solution.  Thus  the 
highly  symmetric  choice  of  =  0  is  to  be  avoided. 

An  attenpt  to  estimate  the  effects  of  ill-conditioning  due  to  the 

presence  of  ^  was  made  by  restricting  iterates  in  BP  to  the  span  of  the 

first  few  singular  functions  of  JT.  When  done  for  c^2  the  resulting 

iterates  reduced  F(G  )  at  a  slower  rate,  gave  final  iterates  S  for  which 

n 

X 

Rd  -  G  li  was  approximately  the  same  as  in  the  unfiltered  algorithm,  and  still 
failed  to  terminate  due  to  the  convergence  of  successive  iterates  but  rather 
ended  due  to  the  satisfaction  of  one  of  the  other  two  conditions.  For  this 
value  of  c  there  are  approximately  nineteen  significantly  nonzero  singular 
values:  the  results  indicate  that  even  over  this  subspace  the  function  F(G) 
is  widely  varying  due  to  the  nonlinearity  and  existence  of  multiple  solutions. 
It  therefore  seems  likely  that  a  severe  restriction,  e.g.  to  the  span  of  the 
first  five  singular  functions,  is  necessary  to  provide  an  easily  solved 
problem. 

The  numerical  results  presented  are  some  typicaZ  examples  of  the  behavior 
of  the  algorithms  on  model  problems  2  and  2a,  with  parameter  pairs  (c,N)  of 
(1.5,22),  (2,32)  and  (2,40)  and  test  fxinction  G,  Eg.  (6.1).  For  numerical 


results  concerning  model  problem  2b,  see  [15] .  Tables  1  and  2  first  give 

the  average  number  of  iterations  n  and  average  final  value  •  F(G-.)  of  RP 

n 

when  iterated  to  convergence  on  several  different  initial  guesses,  each 

perturbed  by  an  amount  from  the  true  solution.  The  remaining  entries 

are  the  ratio  n/n,  where  is  the  number  of  iterations  required  by  the 

remaining  algorithms  to  reduce  F(Gv)  to  below  F(G  ).  Cases  where  the 

n  n 

algorithm  on  trial  failed  to  reduce  F(G^)  to  within  100  F(G^)  are  denoted 

"  n 

by  a  Figxires  4-7  show  4  typical  final  iterates  reached  by  these  algo¬ 

rithm  for  varying  problems  and  values  e^. 

The  filter  parcuneter  of  Eg.  (5.14),  (5.15),  (5.22)  and  (5.23)  at 

the  n-th  iteration  was  calculated  from  the  quantity 

e  «  max{min{4llX  -H  ,  II  -  .025,  .2) ,  .002}  .  (6.5 

n-1  n-l 

For  FN  and  FNP,  was  taken  to  be  p;  for  SQFN  and  SQFKP, 

This  expression  was  calculated  by  trial  and  error  and,  although  by  no  means 


the  last  word,  at  least  has  the  property  of  giving  search  direction  su 

that  the  associated  step  length  X  almost  always  lay  in  the  interval 

n 

[.1,1.5]  and  X  ~1  when  close  to  the  true  solution, 
n 


7.  CONCLUSION 


The  ill-posed  nature  of  phase  retrieval  induced  too  much  variation  in 
numerical  calculations  to  allow  the  drawing  of  quantitative  judgements  from 
these  results,  but  some  qualitative  remarks  can  be  made.  For  these  one¬ 
dimensional  problems  FP  was  clearly  the  least  expensive  in  terms  of  total  computa¬ 
tional  cost  to  reach  a  desired  objective,  and  it  is  difficult  to  see  hou’  any 
other  algorithm  can  be  improved  through  taking  advantage  of  sparse  Hessians, 
etc.  to  seriously  challenge  RP  for  this  position.  However  SQFNP  was  the 
most  robust  in  producing  acceptable  iterates  from  almost  any  starting  point , 
and  although  it  did  not  display  to  the  same  degree  the  apparent  quadratic 
convergence  of  FN  and  FNP  when  close  to  the  true  solution,  this  could 
possibly  be  remedied  by  a  better  choice  of  filter. 

Efforts  to  estimate  the  role  of  local  ill-conditioning  and  filtering  in 
phase  retrieval  problems  were  largely  frustrated  by  the  effects  of  global 
ill-conditioning  due  to  the  existence  of  multiple  solutions.  This  makes  one 
dimensional  problems  unsolvable  for  practical  purposes,  but  it  is  expected 
that  in  higher  dimensional  problems  of  real  interest,  with  fewer  possible 
exact  solutions,  that  local  ill-conditioning  will  become  dominant.  Hopefully 
this  may  be  removed  by  filtering  and  quadratically  convergent  algorithms 
will  have  a  wider  domain  of  convergence,  thus  becoming  competitive  with  the 
present  first  order  iterative  schemes. 
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Solution  G  to  exan^le  2  reached  from  an  initial  guess  w 

e^*20%:  -  exact  phase  from  Eq.  (6.1)j  o  reconstructed  phase 

-  exact  modulus  from  Eq.  (6.1);  ^  reconstructed  modulus. 


Solution  G  to  example  2a  reached  from  an  initial  guess 

ej^*60%:  -  exact  phase  from  Eq.  (6.1);  o  reconstructed  phase 

exact  modulus  from  Eq.  (6.1);  a  reconstructed  modulus. 


ABSTRACT 


Problems  involving  reconstruction  of  partially  known,  bandlimited 
Fourier  transform  pairs  from  noisy  data  are  now  regularly  encountered  in  a 
wide  variety  of  scientific  and  technical  areas.  This  paper  is  the  first  in 
a  series  of  studies  of  algorithms  for  their  solution.  These  studies  focus 
on  the  algorithmic  structure  with  respect  to  the  dominant  feature  of  such 
problems,  that  they  are  ill-posed.  The  algorithms  developed  here  are  for 
the  linear  prototype  problem;  namely  extrapolation  of  a  bandlimited  function 
known  over  a  finite  interval.  The  problem  is  cast  as  the  inversion  of  a 
linear  operator,  the  finite  Fourier  transform.  Criteria  can  be  deduced  from 
a  knowledge  of  the  spectrvun  of  this  operator  for  the  suitability  of  extrapola¬ 
tion  algorithms.  These  criteria  are  used  to  evaluate  existing  algorithms 
(such  as  the  Gerschberg-Saxton)  as  well  as  our  new  algorithm  based  upon  in¬ 
version  of  a  Galerkin  approximation  to  the  operator  using  singular  value  de¬ 
composition.  Numerical  results  on  the  relative  merits  of  different  discretiza 
tions  in  our  algorithm,  and  on  its  success  in  extrapolation  in  two  examples 
(with  optical  diffraction  interpretations)  with  noisy  data  are  presented. 


1. 


INTRODUCTION 


A  recurring  problem  in  many  fields  is  the  reconstruction  of  a 
Fourier  transform  pair  g  ,  G  from  partial  data  on  either  or  both  fvmc- 
tions.  Considerable  effort  has  been  expended  in  the  development  of  algorithms 
for  its  solution;  although  there  have  been  some  successes,  the  problem  has 
generally  proved  to  be  difficult.  In  a  series  of  papers,  of  which  this  is 
the  first,  we  aim  to  detail  the  features  that  cause  difficulty  for  numerical 
calculations  and  to  construct  algorithms  that  take  explicit  account  of  such 
features.  It  is  not  possible  to  remove  such  difficulties,  but  if  they  are 
ignored  in  the  construction  of  algorithms  they  invariably  surface  later  in  the 
solution  in  a  more  inconvenient  form. 

The  canonical  examples  for  such  reconstructions  are: 

Exampie.  I,  ExXAxipo lotion  Band- Limltzd  Signals.  Given  a  noisy 

measurement  of  g  on  an  interval  knowledge  that  G 

vanishes  outside  the  bounded  interval  9  ® 

the  entire  real  line. 

Example  1  is  the  archetypical  linear  problem  in  transform  recon¬ 
struction.  A  number  of  algorithms  have  been  proposed  for  its  solution,  either 
by  iterative  meeins:  Gerschberg  and  Saxton  [1],  Papoulis  [2],  Youla  [3]  or  by  direct 
means:  Cadzow  [4],  Sabri  and  Steenaart  [5]. 

Examplz  2,  Thz  PhcUtZ  PAoblem.  Given  a  noisy  measurement  of  Ig| 
on  an  interval  knowledge  that  G  vanishes  outside  the  bounded 

interval  [bj^,b2l,  reconstruct  g  and  G  on  the  entire  real  line. 


Example  2  has  been  of  theoretical  interest  for  some  time;  see 
for  example  Burge,  Fiddy,  Greenway,  and  Ross  [6] ;  however  in  this  most 
general  form  it  has  proved  intractable.  Numerical  solutions  obtained  in 
particular  cases  have  done  so  by  (sensibly)  incorporating  further  knowledge 
of  g  and  G  .  Two  such  examples  are: 

Exampte  2a,  The  Two  ModiitL  p^btesn.  Given  noisy  measurements  of 
|gl  on  [a^,a2]  ,  |g|  on  knowledge  that  G  vanishes 

identically  outside  of  [b^,b2l,  reconstruct  g  ,  G  over  the  entire  real 
line. 

Example  2b,  The  Phase  Problem  Mlth  Nonnegatlv-ity  ConsViaints. 

Given  noisy  measurements  of  |g|  on  knowledge  that  G  is 

nonnegative  over  [b^,b2]  and  vanishes  identically  outside  [b^jb^l,  re¬ 
construct  g  ,  G  over  the  real  line. 

Gerschberg  and  Saxton  [1]  developed  a  widely  used  algorithm  for 
Example  2a;  almost  all  the  iterative  algorithms  for  the  solution  of  the  gen¬ 
eral  reconstruction  problem  are  suitably  modified  versions  of  this  particular 
case.  The  Gerschberg-Saxton  algorithm  (hereafter  denoted  as  the  GS  algorithm) 
in  its  general  form  is  essentially  the  steepest  descent  algorithm  with  unit 
step  length  and  so  is  first  order  [7].  An  alternative  second  order  algorithm, 
based  on  Newton's  method,  has  been  proposed  by  Barakat  and  Newscim  [si.  In  [9], 
the  GS  algorithm  is  successfully  applied  to  a  problem  of  the  type  in  Example  2b 

The  ceuionical  examples  presented  are  simple  in  form,  nevertheless 
they  contain  the  salient  features  that  make  other,  more  complicated,  problems 
intractable.  The  font  of  all  difficulties  is  that  the  reconstruction  problem 
1*5  ill  posed,  euid  badly  ill  posed  at  that.  The  original  definition  of  a 
well  posed  problem  is  due  to  Hadamard  (10). 


Vz^^ruX^on. 


A  problem  is  well  posed  if  the  solution 


1.  exists 

2 .  is  unique 

3-  depends  continuously  on  the  data. 

If  a  problem  violates  any  of  these  conditions,  it  is  ill  posed. 

Example  1  satisfies  only  condition  2;  therefore  it  is  ill  posed. 

To  see  that  it  fails  condition  1,  it  suffices  to  note  that  g  is  the  trans¬ 
form  of  a  function  with  bounded  support  so  that  it  is  analytic  by  the  Paley- 
Wiener  theorem  [II].  Any  perturbation  of  the  true  g  by  noise  that  is  not 
analytic  produces  a  perturbed  problem  with  no  solution.  Analyticity  does 
imply  condition  2,  but  also  that  condition  3  will  fail  very  badly.  This  is 
due  to  the  natural  error  metric  for  noise  being  the  standard  energy  norm, 
which  is  of  no  use  as  a  measure  of  analyticity.  Therefore  noise  that  is 
arbitrarily  small  in  the  error  metric  will  produce  arbitrarily  large  changes 
in  the  solution,  or  even  cause  it  not  to  exist.  This  aspect  of  the  problem 
is  discussed  in  section  2. 

Exanple  2  fails  all  three  conditions;  condition  1  for  the  same 
reason  as  Exeui^^le  2.  Akutowicz  [12,13]  shows  the  existence  of  multiple 
solutions  for  the  phase  problem  both  with  and  without  nonnegativity  con¬ 
straints.  It  is  not  known  whether  the  two  moduli  problem  has  multiple  solu¬ 
tions,  except  for  some  trivial  multiplicities  due  to  symmetry  and  constant 
phase  factors.  As  for  condition  3,  Example  2  can  be  locally 
linearized  into  Example  1,  so  it  also  fails  this  condition. 
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Many  researchers  do  not  seem  to  be  aware  of  the  full  implications 
of  these  failures  for  numerical  algorithms  for  approximate  solutions. 

Gerschberg  and  Saxton  have  been  careful  to  show  that  their  algorithm  is 
always  error  decreasing  and  to  give  examples  of  behavior  using  noisy  data. 
Chapiticin  [  14]  modified  their  algorithm  to  make  even  greater  use  of  known  noise 
levels.  However  the  overall  approach  is  still  on  an  ad  hoc  basis,  noise  cannot 
be  explicitly  handled  in  a  satisfactory  fa-shion  in  an  iterative  algorithm. 
Papoulis  [2]  gives  error  bounds  for  his  iterative  method,  but  not  a  way  of 
incorporating  them  efficiently  into  calculations.  It  will  be  shown  in  this 
paper  that  direct  methods  do  allow  development  of  robust  algorithms.  The 
two  direct  methods  proposed  in  Cadzow  [3],  Sabri  and  Steenaart  [4]  call 
for  inversion  of  ill  conditioned  matrices  formed  with  no  concern  for  optimiz¬ 
ing  the  condition  number.  The  optics  literature  contains  many  references 
to  "the  number  of  degrees  of  freedom"  [15,16]. 

However,  the  natural  corollary  has  not  been  noted;  nimierical  solutions  to  this 
problem  should  contain  discretizations  of  approximately  this  size  and  no  signifi 
cant  improvement  in  results  can  be  obtained  by  using  greater  numbers  of  points, 
approximating  functions,  etc.  Ignorance  of  this  point  has  lead  to  unnecessarily 
long  calculations  in  the  belief  that  more  points  give  greater  accuracy. 

This  paper  aims  to  provide  a  detailed  study  of  Example  1  in  the 
language  of  numerical  analysis,  and  to  use  this  knowledge  to  construct  numeri¬ 
cal  algorithms  for  its  solution  in  a  rational  manner.  Section  2  contains  an 
exposition  of  the  structure  of  the  finite  Fourier  transform  and  the  linear 
operator  associated  with  Example  1.  Using  this  structure  in  conjunction  with 
the  theory  of  singular  value  decomposition,  the  solution  is  decomposed  into 


two  parts;  the  first  of  which  is  finite  dimensional  and  contains  significant 
information  in  the  presence  of  noise, while  the  second  contains  no  trustworthy 
information  in  the  presence  of  noise.  In  section  3,  numerical  algorithms  are 
proposed  that  efficiently  identify  these  two  components,  in  addition  a  list 
of  desirable  features  of  possible  discretizations  is  given.  Section  4  contains 
an  exposition  of  iterative  methods,  and  a  comparison  of  their  merits  with  those 
of  the  direct  methods  of  section  3.  Finally  section  5  contains  a  discussion 
of  some  particular  discretizations  with  numerical  calculations  showing  their 
robustness  in  the  presence  of  noise. 

A  clear  exposition  of  the  linear  problem  is  important  not  only  in 
its  own  right,  because  as  mentioned  eibove  the  local  structure  of  nonlinear 
reconstructions  can  be  approximated  by  linearization  in  the  neighborhood. 

The  results  of  section  2  on  this  linearization  show  that  the  nonlinear  re¬ 
constructions  are  essentially  problems  over  finite  dimensional  manifolds. 

If  Newton's  method  is  used  to  carry  out  these  reconstructions,  an  accurate 
efficient  way  of  identifying  the  tangent  plcuie  to  this  manifold  is  needed. 
Reference  [ 8 ]  shows  that  singular  value  decomposition  can  accomplish  this 
task.  We  believe  that  the  results  of  this  paper  show  that  Newton's  method 
coupled  with  careful  discretizations  for  nonlinear  reconstructions  can  be 
competitive  with  the  widely  used  iterative  methods. 
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THEOBY  OF  THE  MODEL  LINEAR  PROBLEM 


This  section  contains  an  analysis  of  the  theory  of  the  linear 
reconstruction  problem  discussed  in  the  previous  section.  A  formal 
description  is: 

Given  a  function  g(v)  measured  on  v  e  A  =  that  is 

knovm  to  be  in  the  Fourier  transform  of  a  function  G{co)  which  has  support 
contained  in  B  =  [b^,b2] ,  extend  g(v)  to  a  function  g(v)  defined  on  the 
entire  real  axis. 

For  convenience  we  introduce  the  following  notation. 

2 

Ve.^yCn.itlon  I.  Let  G(iis)  e  L  ,  the  Fourier  transform  g(v)  of 
G(oj)  is  defined  as 


g(v) 


^  f 

•'-O 


2'iriva)  _ ,  .  - 
e  G  (0))  du) 


(2.1) 


and  denoted  by 


g  = 


(2.2) 


The  inverse  tremsform  is  denoted  by 


G  =^‘^g 


(2.3) 


Vt^-LyiLtLon  2.  Let  S  be  a  subset  of  the  real  line.  The  projec¬ 


tion  operator  on  L  associated  with  S  is  denoted  by  P  euid  defined  by 


(P_f)(v)  =  f(v) 


V  e  S 


In  this  notation,  the  model  problem  is:  Given  a  measurement  g  , 
find  g  and  G  such  that 


g  =  Pg  jrP^G  ,  g  =  ,^P^G  (2.5) 

The  problem  can  be  symmetrized  using  simple  translations  and  scaling  *"0  read: 
Given  a  measurement  g  ,  find  g  ,  G  such  that 


where 


g  =  PjrPG  ,  g=,^PG  (2.6) 

c  c  c 

-  1*  1/2 

C=  [-c,cl  ,  c  =  j  [(a^ -a^)  (b^-b^)]'"'^^  (2.7) 


To  solve  Equation  2.6,  we  need  to  convert  the  linear  operator 

P  P  .  As  noted  in  the  previous  section  this  is  an  ill  posed  problem, 
o  c 

By  the  Paley-Wiener  theorem  [11];  g{v)  ,  the  Fourier  transform  of  a  ftanction 
with  bounded  support  is  analytic;  consequently  it  has  a  unique  extension  g(v) 
Therefore  the  model  problem  is  one  of  analytic  continuation  which  is  known  to 
be  highly  sensitive  to  measurement  error  in  g(v),  see  [1,8].  In  order  to 
make  quantitative  judgements  on  the  effects  of  noise  in  Equation  2.6,  we  shall 
use  the  concept  of  singular  value  decomposition.  We  assume  the  reader  to  be 
familiar  with  this  technique  applied  to  either  infinite  dimensional  compact 
linear  operators  or  to  finite  dimensional  matrices;  a  discussion  of  the  first 
case  can  be  found  in  Baker  [17],  of  the  second  in  Stewart  [18]. 


In  a  series  of  papers,  Slepian  at  al.  [19-21],  the  stnacture  of 
P^  P^  was  fully  explored.  The  key  results  are  repeated  here: 


1)  P  P  is  a  normal  compact  operator,  and 


2)  The  eigenfunctions  of  taken  as  real.  They 

are  then  the  prolate  spheriodal  wavefiinctions  here  denoted  by  <h  (v,c)  , 

n 

making  explicit  the  dependence  on  c  . 


3)  The  eigenvalues  X  (c)  satisfy 

n 


X  (c)  =  e^™^^  lx  (c) 


(2.9) 


^  XI  >0 

'  n '  '  n+1 ‘ 


4)  The  singular  values  are  denoted  by  o  (c)  with 

n 


0  (c)  =  |X  (c) 
n  n 


(2.10) 


They  behave  as  follows 


a.  For  a  fixed  c  and  increasing  n 


a  (c)  (cVn)^” 

n 


(2.11) 


b.  For  a  fixed  n  ,  there  exist  constants  a  ,  6 

n  n 

such  that  for  increasing  c 


o  (c)  't  1  -  a  e 
n  n 


(2.12) 


Since  ^  is  compact,  it  has  a  singular  value  decomposition 
U  ,  I  ,  V  where  U  amd  V  are  complete  sets  of  orthonormal  functions,  euid 
£  is  the  decreasing  sequence  of  nonnegative  numbers  defined  in  Equation  2. ID 


Since  P  ^  P  is  normal 


U  = 

ju  d)  (u), 
'  n  n 

k  00 

(2.13) 

V  = 

{v  (p  (U), 
'  n  n 

k  ^ 

where  u 

n 

and 

V 

n 

are 

constants 

such  that  |u  1  =  Iv  1 
'  n '  '  n' 

=  1.  With  this 

machinery, 

the 

solution 

to  the  problem  can  theoretically 

be  determined  by 

expainding 

g  and 

G  as 

00 

g(v) 

=  E 

n=0 

(2.14) 

00  ‘ 

G(U)) 

=  E 

n=0 

(2.15) 

The  coefficients  a  are  determined  directly  from  the  knovm  §  amd  the  b 

n  n 

from  the  relation 

g  =  P  ^  P  G  (2.16) 

c  c 

the  final  result  is 

a 

b  =  —  (2.17) 

n  o 

n 

In  this  representation,  the  degree  of  ill  conditioning  of  Equation 

2.6  (i.e.  the  measure  of  how  ill  posed  Equation  2.6  is)  as  well  as  the 

effect  of  noise  can  be  determined  from  Equation  2.17  and  the  behavior  of 

a  .  Intuitively  since  o  -►  0  ,  by  Equation  2.12  a  small  error  in  the 
n  n 

coefficients  a  of  a  high  frequency  (large  n)  component  a  v  (|)  (v,c)  of 
n  n  n  n 

g(v)  will  induce  a  large  error  in  the  coefficient  b^  ,  given  by  Equation 
2.17,  of  the  corresponding  component  of  G(u)}.  A  more  precise  statement  is 
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Le/nma  1  Suppose  the  relative  error  (in  the  L  norm)  on  the 
measured  g  is  e  ,  and  the  maximum  absolute  error  that  will  be  toler¬ 
ated  in  G  is  6.  Then  there  is  a  decomposition  of  G  into  two  components 
G^  and  G^  ,  in  which  G^  is  always  accurate  within  an  error  6  and 

G^  cannot  be  guaranteed  to  be  this  accurate. 

Pkoo^:  Let  G  i.e  decomposed  as  the  sum  in  Equation  2.15.  For  a 

fixed  N  ^  the  worst  possible  absolute  error  in  the  first  N  components  of  G 

is  induced  by  assuming  that  a^ , . . . , are  precisely  known  and  that 

a  =  1,  a_  =  •  •  •  =  a.,  ,  =  0  ,  a„  is  in  error  by  e  and  a.  =  0  for  i  >  k- 

1  2  N-1  N  1 

Then 

b  =  I/O  ,  b_  =  •••  =  b„  -  =  0  ,  b.  =  0  for  i  >  N 

11  2  N-1  1 

thus  the  absolute  error  induced  is  (CO./a  ) .  If  N  is  chosen  so  that 

X  N 

<  6  <  (eVVl’ 

then  the  first  N  con^nenets  of  G  are  always  known  to  within  an  absolute 
error  6  ,  but  the  remaining  components  are  not  that  trustworthy. 

But  as  the  decomposition  allows  a  direct  observation  of  the  effects 
of  noise,  so  also  does  it  allow  direct  action  to  reduce  these  effects.  A 
filter  f(a,c)  dependent  on  the  noise  levels  e  ,  6  is  introduced  so  that  G 
is  estimated  by  G' 

00 

G' (co)  =  Y.  a  f{a  )(})  (UJ,C)  (2.20) 

n  n  n  n 
n®0 


where 


as 


0  -*•  1 
n 


(2.21) 


f(a  )  (0  ) 

n  n 


-^0  as  0  -►  0 

n 

In  order  to  construct  a  sensible  filter  for  the  general  problem, 
we  first  note  the  following  conclusions  from  Equations  2.8  -  2.12. 

a.  P  . jT  P  is  badly  ill  conditioned  because  0  (c)  exhibits 

c  c  n 

exponential  decay  in  n 

b.  For  a  given  c  ,  the  singular  values  are  approximately 
distributed  as  a  step  function  in  that  there  exists  eui  n (c) 
such  that 


n  <  N  (c)  «•  0  (c)  'u  1 
n 

(2.22) 


n  >  N  (c)  ^  0  (c)  -v  0 

n 


c.  For  a  given  c  eind  noise  levels  e  ,  6  let  N(e,6,c)  be  the 
N  of  Equation  2.19.  Then  N(c,5,c)  is  bounded  by  Equation  2.12 
Furthermore  from  conclusions  a  and  b,  n(£,5,c)  is  relatively 
insensitive  to  c  and  6  (i.e.  for  almost  all  e  and  6  there 
exists  a  p  <<  1  such  that 


N(e,6,c)-1  <  N(e,p6,c)  <  N(e,6,c) 

<  N(pe,6,c)  <  N(e,6,c)  +  1 


(2.23) 


The  general  choice  of  filters  is  an  art,  however  here  we  make 


the  particular  choice 


f(o,e)  =  1/a 

=  0 


a  >  e/5 
0  <  e/6 


(2.24) 


relying  on  conclusions  b  and  c  above.  This  filter  effectively  truncates 
the  series  in  Equation  2.20  leaving  only  the  trustworthy  component  G  ,  of 
Lemma  1.  Since  the  filter  is  insensitive  to  e  and  6  ,  the  number  of 
elements  in  the  truncated  sum  of  Equation  2.20  is  insensitive  to  noise  levels 
It  corresponds  to  the  "essential  number  of  degrees  of  freedom"  of  the  model 
problem  mentioned  in  section  1.  We  stress  that  this  number  is  relatively 
noise  independent  (i-e.  a  function  of  c  with  veak  dependence  on  e  and  6  ) 
and  that  it  represents  the  number  of  components  of  the  solution  that  are  of 
guaranteed  accuracy. 


Based  on  these  results,  the  following  algorithm  is  proposed.  For 

a  given  c  ,  e  ,  emd  5  calculate  N(c,e,6).  If  it  is  sufficiently  small, 

accurate  numerical  approximations  to  a  ,  ({)  (v,c)  are  calculated  for 

n  n 

n  <  N(c,c,5)  2uid  G  is  estimated  using  Equation  2.20.  This  procedure  is 
optimal  in  that  it  focuses  on  accurately  calculating  components  of  g  that 
are  significcuit  in  estimating  G  v^ile  ignoring  components  that  are  useless 


for  estimation. 


3.  DISCRETIZATION  AND  ASSOCIATED  THEORY 


In  this  section  we  consider  a  general  class  of  finite  dimensional 
approximations  to  g  ,  G  and  weigh  the  merits  of  different  approximations. 
The  discretizations  considered  are  derived  using  Galerkin's  method,  they  are 


based  upon  the  natural  error  metric 


for  the  model  problem.  We  choose 


two  sets  of  linearly  independent  functions  on  L  [-c,c] 


(3,1) 


and  introduce 


*  •  K 

Ve.^/Jiction  3.  Let  S  denote  the  subspace  spanned  by  {4^.  » 

]\  Jv  JC~X 

Sj^  the  subspace  spanned  by  projection  operator 

2  2 
from  L  onto  S  ,  Q  the  projection  operator  from  L  on  S  . 

K  L  X# 

We  generate  approximations 


e  G 

li  1j 


(3.2) 


by  requiring  that  g  ,  G  minimize 

iv  id 


|g“h|| 


h  e  s„ 


(3.3) 


(3.4) 


Since  S  ,  S  are  finite  dimensional,  such  approximations 
K  L 

always  exist.  By  expanding  g  ,  G  in  terms  of  the  basis  functions 

K  L 


Vk 


(3.5) 
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and  defining  the  quantities 


=k ' 


(3.6) 


A. ,  =  (6.  ,  e.) 

i:  ID 


(3.7) 


then  Equations  3.3  and  3.4  can  be  rewritten  as  discrete  equations  for  the 

vectors  a  =  (a-,...,a^)^  ,  B  =  (b, _ _ ,bv)^  . 

1  L  1 


We  note  that 


g  minimises  Equation  3.3  B  B  =c 
K 


(3.7) 


G  minimizes  Equation  3.4  «*  a  minimizes 

L 


B  ^(c-Dcblj  SeR 


(3.8) 


where  the  matrices  B  ,  D  are  defined  in  Equations  3.6  and  3.7.  For 
typographic  convenience,  the  subscript  2  will  be  omitted  from  |1  •  ||  . 


In  this  construction,  there  is  considerable  freedom  of  choice  for 
the  subspaces  S  ,  S  cuid  their  bases;  it  is  therefore  appropriate  to 

K  1j 

adopt  criteria  by  which  a  particular  choice  can  be  judged.  We  list  six 
criteria  for  evaluation  of  approximations  and  for  each  criterion  list  features 
or  bases  that  produce  cui  approximation  giving  good  results.  We  Ceui  then  seek 
a  "best  fit"  approximation  to  these  features. 

/.  Uitcquzms^  app^xmatLon:  Because  ||•||  is  a  strictly 

convex  norm  g  is  unique.  If  K  <  L  then  G  is  not  unique;  if  K  >  L 
K  L  “ 

then  G.,  is  unique  unless  P  JTp  s  contains  components  orthogonal  to  S^. 

L  c  c  L  K. 


Alignments  of  this  form  are  pathological;  the  practical  problem  is  the 
occurrence  of  ntOAZy  orthogonal  components.  These  components  will  contri¬ 
bute  to  ill  conditioning  in  D  ,  however  both  the  pathology  and  the  ill 
conditioning  can  be  avoided  by  choosing  so  that 


P 

c 


C 

C  L  — 


(3.9) 


Because  the  true  solution  is  unique  there  appears  to  be  no  gain 

in  constructing  nonunique  approximations.  Henceforth,  we  assume  that  K  >  L 

and  that  G  is  unique. 

Xj 

2.  AccuAacy  app^xZ/ndtiom:  An  obvious  requirement  for  any 
sequence  of  approximations  g  ,  G  is  that  _ 

K  Jj 

lim  llg-g  11  =0  (3.10) 


and  if  g  is  noiseless  that 


lim  11g-G  11  0  (3.11) 

Necessary  conditions  for  Equations  3.10  and  3.11  to  hold  are  that  the  sequences 

-1  ^  con^jlete  sets  in  L  [-c,c].  These  conditions  are  also 

sufficient  for  Equation  3.10,  but  not  for  Equation  3.11  because  the  inverse 

of  p  ^  P  is  imbounded 
c  c 

The  important  issue  for  numerical  calculations  is  the  rate  of  con¬ 
vergence  of  such  approximations.  Rather  than  attempt  direct  estimates  of 
11g-G^11  we  consider  a  more  easily  computed  quantity.  The  approximating 


subspaces  chosen  implicitly  define  an  associated  finite  dimensional  operator 


(3.12) 


The  quantity  II  ^  ^  II  then  serve  as  a  measure  of  the  accuracy 

KXj  C  C 
of  the  approximation. 

We  therefore  seek  fiinctions  'I'l,  »  such  that 

iC  X' 

,  G  (3.13) 

and  that  minimize  this  error  metric.  The  following  proposition  provides  a 
partial  answer 

PfiCpOiitZon  U  If  K  =  L  =  N  and  ,  6^^^  are  chosen  so  that 

4>-r,  (v)  =  0  (v)  =  4)  (v)  (the  k-th  eigenfunction  of  P  ^  P  )  ,  then  the 

K  K  K  C  C 

Galerkin  approxiination  generated  by  Equations  3.3,  3.4  satisfy  Equations 
3.10,  3.11  as  N  .  Furthermore  for  any  other  subspces  S„,  ,  S  ,  ,  with 

Jv  Ju 

K'  ,  L'  <  N  , 

jrpJI  <  (3.14) 

The  proof  of  this  and  the  succeeding  two  propositions  can  also  be  found  in 
[  ].  Thus  the  eigenfunctions  are  an  "optimal”  basis  for  construction  of 

approximations  under  this  criterion 

3.  Conditioning  o^  A,  B,  D:  The  previous  criterion  aids  in 

distinguishing  suitable  approximating  subspaces  S  ,  S  .  One  must  now  make 

a  choice  of  bases  within  these  subspaces.  It  is  possible  to  choose  ,  6^^^ 

so  that  D  is  well  conditioned,  but  this  is  an  illusory  gain  because  the 

matrices  B  and/or  A  will  then  be  ill  conditioned.  Therefore  calculations 

of  9„(v)  ,  G. (w)  made  by  using  the  representations  in  Equation  3.5  will  be 
K  L 
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error  prone.  Since  the  most  informative  measure  of  ill  conditioning  in  a 
metric,  the  condition  number,  is  not  easily  calculated;  we  use  a  less  infonr.e- 
tive,  but  more  easily  manipulated,  measure  for  comparison  of  different 
bases,  the  determinant. 

VfijOpo^iX/ion  2;  Let  S  ,  S  be  fixed  subspaces  with  orthonormal 

K  Li 

bases  '  '*-®5,^a=l  '  ^’^k^k=l  '  other  bases 

for  S  ,  S  and  the  corresponding  matrices  A,A',B,B',D,D'  are  defined 
K  L 

by  Equations  3.5,  then 

Idet(B'D'A') 1  <  |det(BDA) 1  =  |det(D)|  .  (3.15) 

PfLOpo.iAjU.on  3:  Let  K  =  L  =  N  ,  1];^^  =  6^^  =  4>j^  .  Then  for  any  other 

subspaces  S  ,  S  ,  with  K'  ,  L*  >  N 

Idet  D' I  <  |det  d|  (3.16) 

These  two  propositions  demonstrate  that  under  this  measure  of  ill  conditioning, 
orthonormal  bases  should  be  chosen  and  that  the  best  choice  of  bases  is  the  eigen¬ 
functions  . 

4.  VziAJLobZz  ipzciaZ  fiziutti:  Particular  choices  of  bases 

yield  interesting  additional  results.  If  both  bases  are  orthonormal,  then 

the  spectral  properties  of  b  are  identical  with  those  of  R  ,  which  in 

iCu 

turn  approximate  those  of  P^dTp^.  if  K  =  L  and  ,  6j^  satisfy 

l-k  *  '■c  "c  \ 

then  the  Galerkin  approximation  is  also  a  least  squares  solution,  i.e.  G 

Li 

minimizes 
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(3.18) 


Both  these  results  are  obtained  if  K  =  L  =  N  ,  ip,  =  6,  =  (t>,  . 

k  k  k 

5.  PA.eAe.nce.  no^e:  The  above  criterion  are  concerned  only 
with  properties  of  the  approximations.  The  important  property  of  the  data, 
that  it  contains  noise,  must  be  considered.  In  section  2  we  showed  that  it 
forced  a  division  of  the  solution  into  trustworthy  and  untrustworthy  components. 
The  approximations  therefore  must  echo  this  division  as  accurately  as  possible. 

6.  EoAe.  0^  compatAtAOn:  in  light  of  the  above  discussion 

the  obvious  choice  for  a  basis  are  the  eigenfunctions,  but  we  do  not  consider 
their  use  as  there  is  no -known  simple  numerical  algorithm  for  their  calcula¬ 
tion.  To  keep  computing  costs  down  roust  have  a  simple  closed  form 

representation.  Furthermore  the  functions  9.  should  also  be  available 

C  JC 

in  a  closed  form  so  that  calculation  of  an  approximate  interpolation  g  to  g 

L 

is  cheap  compared  to  numerical  evaluation.  If  possible  the  inner  products 

(ip.,  P  P  0-)  should  also  be  found  explicitly. 
iC  c  c  x 

Taking  all  these  issues  into  consideration  we  seek  basis  of 

00 

simply  calculated  functions  '^hich  are  approximations  to  the 

eigenfunctions  <j>j^  ;  or  at  least  possess,  to  some  degree,  the  properties  dis¬ 
cussed  in  the  criteria  above.  In  particular  the  basis  should  be  complete, 
orthonormal ,  and  have  the  property  that  each  eigenfunction  can  be  expressed 
as  a  rapidly  convergent  series  in  . 
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Given  such  a  basis,  we  now  describe  an  algorithm  that  for  a  given 


observation  g  and  noise  levels  e  ,  6  produces  a  Galerkin  approximation 


G,  to  a  component  of  G  that  is  accurate  up  to  an  error  6  ,  as  in  Lemma  1. 
N 


1.  N  ,  K(N)  and  the  subspace  S  .  are  defined  by 

K  C  N  / 


a.  N  is  the  N(e,5,c)  of  Equation  2.19 


r  iK(N) 

*>•  =K(N1  ■  'Pfc'k-l 


c.  K(N)  is  the  least  integer  such  that  the  finite 


dimensional  operator  R  associated  with  K  =  L  =  K(N), 


(N)K(N) 


-  Pr  -  f  • 


(3.20 


2.  The  c  ,  f5  of  Equation  3.5  are  formed  for  this  choice  of  basis 
and  the  singular  value  decomposition  [18] 


B  =  Vl  t 


(3.21) 


calculated. 


3.  The  subspaces  S„ = S_  = S„  from  which  the  Galerkin  approximation 

K  L  N 


Gjj  will  be  formed  by  Equation  3.4  are  the  subspaces  of  ,  being 


the  span  of  the  first  N  eigenfunct ’ ons  of  R  ,  .  ,  ,,  i.e. 

K(N) K(N) 


=  ej^(v) 


K(N) 

L 


^ik  Pi^"> 


k  =  1, . . . ,N 


(3.22) 


4. 


The  approximant  g  to  g  from  S„  can  be  formed  from  the 
N  N 


approxmation  to  g  in  (i.e.  the  vector  c  ). 

Moreover  G.,  can  be  formed  directly  from  c  as 
N 


K{N) 

\  (3.23) 


with 


a  =  V  Z'*’  u'*’  c 


where 


if 


=  0  ,  otherwise 


(3.24) 


5.  The  approximate  extrapolation  g„  to  g  is  now  defined  as 

N 

Km) 

»  2:,  P)c<^> 

k=l 

The  algorithm  may  also  be  used  to  solve  the  inversion  of  noiseless  observations 
g  by  choosing  sequences  ,  6^^  such  that  -►  0  and  calculating  for 

each  ^  approximation  the  above  algorithm.  Then  in  the 

limit  G„,.  ,  converges  to  the  true  G  . 

N(K) 

The  construction  of  an  algorithm  for  the  inversion  of  Equation  2.6 

is  now  complete.  The  salient  features  are;  use  of  the  singular  value 

decomposition  of  P  .3^  P  ,  prior  estimation  of  the  number  of  degrees  of 

c  c 

freedom  N(e,6,  c)  present  at  given  noise  levels,  and  construction  of  a 
Galerkin  approximation  from  a  predetermined  set  of  functions  of  a  simple 
closed  form  that  approximate  the  first  N(e,5,  c)  eigenfunctions. 


A  final  note:  g  can  not  be  approximated  directly  from  the  si:b 


K(N) 


space  spanned  by  ^'^^v^k^k=l  '  often  done  with 


/  X  sin(v  4-  kVc) 
^  (v+kVc) 


The  untrustworthy  components  of  the  basis  must  first  be  filtered  out  by 
singular  value  decomposition  before  the  extrapolation  is  done. 


4. 


ITERATIVE  SOLUTIONS  AND  ASSOCIATED  THEORY 


In  contrast  to  the  direct  algorithm  of  the  previous  section  we 
now  consider  iterative  algorithms  and  integrate  them  into  the  body  of  theory 
already  bu'lt  up.  We  start  with  the  original  iterative  algorithm,  that  of 
Gerschberg  and  Saxton  [  1  ] ;  a  two-step  iteration  that  updates  the  approxima¬ 
tion  in  both  physical  and  frequency  space. 

The  formal  statement  of  the  algorithm  applied  to  the  model  problem 

is: 

a.  Initial  conditions 


g 


0 


=  0 


(4.1) 


b.  Update 


g^,  =P-jrp  G  +  g 
’n+1  c  c  n 


(4.2) 

(4.3) 


where  C  is  the  set  K-C.  An  alternative,  but  more  conventional,  form 
is  easily  obtained.  Concatenating  the  iterations  above  yields 


(4.4) 


G  =  p  G  +  g)  =  (I-P  ^"■^P  )G  +  P  g 

n+1  c  ccn  c  ccnc  c 


upon  noting  that 


P  G  =  G 
c  n  n 


Pc^  -  g 


(4.5) 


P  +  P-  =  I  (the  identity  operator) 


(4.6) 


Equation  4.4  has  a  computational  advantage  over  Equations  4.2,  4.3  because 
it  requires  evaluation  of  f\inctions  in  the  bounded  set  C  ,  instead  of  the 
infinite  domain  C  ,  during  the  iteration- 

Equation  4.4  is  recognizable  as  the  simplest  form  of  iterative 
solution  to  the  normal  equations  associated  with  Equation  2.6,  that  is 


P  G  *  P  .^“^P  g  (4.7) 

c  c  c  c  c 

The  normal  equations  are  not  usually  formed  for  ill  conditioned  systems  since 

they  have  even  poorer  conditioning;  if  P^  singular  values  , 

-1  2 

then  P  ^  P  ^P  has  singular  values  o  .  However  since  the  singular 
c  c  c  n 

values  display  the  step  like  behavior,  quoted  in  Equation  2.23,  for  almost 

2 

all  significantly  non  zero  singular  values  w  fa  1  .  Consequently 

little  is  lost  by  consideration  of  Equation  4.7. 

In  the  presence  of  noise,  the  iteration  based  on  the  update 
Equation  4.4  is  usually  modified  so  that  either  it  stops  when 

where  6  depends  on  the  noise  level;  or  the  damped  update 


is  used. 


G  ^  =  (1  -  X)  G  -  P  ^~^P  G  +  P  ^~^P  g  (4.9) 

n+1  n  c  c  c  n  c  c 

where  A  >  O  depends  on  the  noise  level.  The  update.  Equation  4.9 


if  iterated  to  completion  corresponds  to  a  choice  of  filter 


in  Equation  2.20.  However  iteration  to  convergence  is  impossible  in  practice, 


and  as  yet  there  is  no  good  theory  linking  termination  after  N  steps  with 
accuracy  of  approximation  or  noise  levels.  Therefore  the  clear  insights  singular 
value  decomposition  gave  in  modifying  direct  algorithms  to  take  account  of  noise 
have  no  analog  for  iterative  algorithms. 


There  is  a  large  literature  on  alternative  algorithms  for  iterative 
solutions  of  Equations  2.6  and  4.7,  eg.  [22].  The  most  attractive  of  these  is 
the  conjugate  gradient  method.  This  algorithm  uses  the  residuals 

r  =  P  g  -  P  ^*^p  JTP  G  (4.11) 

n  c  c  c  c  c  n 

in  the  following  iteration  scheme 

a.  Initial  conditions 


P 


0 


0 


(4.12) 


r 


0 


P  ^ 
c 


(4.13) 


b.  Update 


where 


a 

n 


G  ,  =  G  +  a  P 
n+1  n  n^n 


Vl  ■  'n.l  * 


(4.14) 

(4.15) 

(4.16) 


The  conjugate  gradient  method  stemds  midway  between  direct  and 


iterative  methods  since  it  can  be  shown  that  the  iterates  G.,  are  also 


Galerkin  approximations  for  Equation  4.7  formed  by  taking  K  =  L  =  N  , 

r  1 N 

S  =  3  =  spam  ir.  _  .  Although  these  approximations  satisfy  many  of 

the  criteria  of  section  3  (the  r^^  are  orthogonal,  ^  ^  exists)  , 

an  a  pKil'Ki  estimate  of  amount  of  computation  to  achieve  a  desired  accuracy 
cannot  be  made,  nor  can  the  iteration  be  modified  in  the  presence  of  noise 
as  is  done  for  the  direct  methods. 

If  the  iterative  algorithms  are  run  on  a  digital  computer,  a 
basis  is  implicitly  used  to  represent  the  iterates.  The  only  reason  for 
not  directly  using  this  basis  to  form  Galerkin  approximations  is  cost.  With 
present  day  computers,  direct  solutions  are  faster  tham  iterative  solutions 
to  linear  systems  for  all  but  the  largest  scale  problems.  This  fact,  combined 
with  the  theoretical  results  on  the  effective  finite  dimension  of  the  model 
problem  and  a  need  for  careful  treatment  of  noise  lead  us  to  the  conclusion 
that  iterative  algorithms  are  net  the  method  of  choice  for  this  problem. 


5.  ILLUSTRATIVE  NUMBERICAL  CALCULATIONS 

As  an  illustration  of  the  results  obtainable  from  the  previous 
analysis,  we  present  numerical  solutions  to  two  particular  problems.  The 
is  that  of  discriminating  between  possible  bases  for  representation 
of  g  ,  G  .  The  izcond  is  reconstruction  of  two  particular  G(ui)  of  inter¬ 
est  in  diffraction  optics  from  noisy  observations  g(v)  using  the  algorithm 
of  Section  3. 

Three  commonly  used  bases  for  reconstruction  are; 


Pjj(v)  =  aj^Pj^(v/c) 


k  =  0, 1, . . .  ,N 


(5.1) 


P^(V) 


=  cos(kiTv/2c) 


sin(  (k+l)TTv/2c) 


k  =  even 
k  =  0,1, ... ,N 
k  =  odd 


(5.2) 


P,;(v)  =  Y,, 


V  e 


!2ck  -  (N-i-l)c  2c(k-H)  -  (N+l)c 


(N+1) 


(N+1) 


]■ 


=  0 


k  =  0,1, ... ,N 
elsewhere 


(5.3) 


Here  Pj^(x)  is  the  kth  Legendre  polynomial,  and  ,  pj^,  Yj^  are  constants 

chosen  so  that  the  various  Pj^  are  orthonormal.  All  three  bases  may  be  seen 

2 

as  the  initial  segment  of  a  con^slete  basis  for  L  [-c,c].  Van  Buren  [23] 

used  p^  to  give  accurate  representations  of  the  prolate  spherioidal  wave 

2 

fianctions  /  the  Pj^  correspond  to  the  sine  function  basis  of  Equation  3.27 
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used  widely  in  optics  [24,25],  and  the  p  are  the  familiar  piecewise  constant 

/c 

functions  of  numerical  analysis  (often  used  implicitly  in  algorithms  based  on 
sampled  point  values  of  g(v)). 

If  a  basis  used  to  generate  approximate  reconstructions 

via  the  algorithm  of  section  3,  condition  l.c  provides  a  natural  merit  function 
for  the  basis. 

1.  For  fixed  £  and  c  ,  the  merit  of  the  basis  is  the  size  of 
the  least  integer  K  =  K(e,c)  such  that 

<e  (5.4) 

However,  as  estimation  of  the  norm  in  Equation  5.4  is  too  expensive  for  repeated 
calculation,  we  choose  a  slightly  different  merit  function. 

2.  For  fixed  e  ,  c  the  merit  of  the  basis  is  the  size  of  the 
least  integer  L  =  L(c,c)  such  that 

|o^-s^l  <  e  ,  for  all  n  (5.5) 

where  s^  are  the  singular  values  of  . 

Since  the  p  are  orthononnal,  the  s  are  also  the  singular 
K  n 

values  of  the  matrix  D  .  This,  together  with  knowledge  of  o  ,  allows  cal- 

n 

culation  of  L(e,c)  for  varying  p.  ,  e  and  c  . 


To  rank  the  above  bases,  their  L(e,c)  were  calculated  for 

c  =  .5,  1.0,  1.5,  2.0  and  e  =  .002,  .02.  The  results  appear  in  Tables  1 

and  2.  N(e,c)  denotes  the  number  of  o  greater  than  e  ;  L^{e,c)  ,  the 

n 

r 

L(e,c)  of  basis  ip.  )  •  Because  it  is  overly  costly  to  consider  every  value 

Jv 

of  N  ,  the  table  entries  give  an  interval  containing  N  rather  than  the 
exact  result.  The  singular  value  decompositions  of  the  matrices  D  were 
evaluated  using  the  subroutine  LSDVF  from  the  IMSL  library. 

The  tables  indicate  that  {p^}  has  by  far  the  highest  merit  rating. 
However,  the  complexity  of  programming  and  calculating  the  Legendre  polynomials 
eind  their  finite  Fourier  transforms  the  spherical  Bessel  functions  j^(y)»  [26] 

Pj^{x)  e^^'‘  dx  =  2i^  jj^(y)  (5.6) 

affects  this  rating.  We  feel  their  use  is  worthwhile  only  for  large  c  or 

2 

high  accuracy  computations.  The  Pj^  are,  suprisingly,  the  worst  of  the  basis 

functions;  it  requires  a  very  large  number  of  such  functions  to  approximate 

closely  the  higher  order  eigenfxuictions  '^’2k+l'  believe  this  is  due  to 

sin  [  (k+l)Trv/2c]  vanishing  at  the  endpoints  v  =  +  c  ;  however,  '^2k+l^-'^^  ^ 

2 

consequently  the  *(’2)^+1  produce  poor  approximations  in  this  region.  The  same 

problem  appears  to  a  lesser  degree  in  approximating  even  eigenfunctions 

at  the  endpoints  the  derivatives  of  cosCkirv/c)  vanish  whereas  (+c)/dv  ^  0 

2 

The  low  rating  of  the  marks  them  as  a  poor  choice  for  reconstruction  and 

imply  the  sine  functions  should  not  be  used  for  extrapolation. 

The  basis  of  choice  appears  to  be  {pj^}  ,  combining  adequate  approxi¬ 
mation  power  with  ease  of  confutation.  The  basis  also  has  natural  extensions 


to  other  problems.  For  greater  accuracy,  higher  order  splines  may  be 


preferable  in  use  to  reconstruction  problems  in  higher  dimensions 

over  arbitrarily  shaped  support  sets  A  ,  B  the  basis  extends  to  give  the 
usual  finite  element  type  approximations. 

These  results  on  Galerkin  approximations  serve  as  a  guide  to  construction 
of  good  discretizations  based  on  pointwise  quadrature  rules.  As  an  illustra¬ 
tion  we  present  some  results  on  a  discretization  of  Equation  (2.5)  on  Gaussian 
r  1 N 

quadrature.  Let  abscissae  of  the  N  point  Gaussian 

r  iN 

quadrature  scheme  on  [-c,c]  with  associated  weights  Then 

Equation  (2.5)  may  be  approximated  by 

yS  ^  A 

g  =  FWG  (5. 7) 


where  F  is  an  N  x  N  matrix  with  entries 

2lTipj^p^ 


(5.8) 


W  a  diagonal  matrix  with  entries  W  =U)  and  g  and  6  are  vectors  whose 
k-th  entries  are  (hopefully)  good  approximations  to 

However  in  this  particular  discretization  the  Euclidean  norms  of  the 

/N  2 

vectors  g  and  G  bear  little  relation  to  the  L  norms  of  g(v)  and 
G(to)  and  calculations  show  that  the  singular  values  of  FW  are  not  good 
approximations  of  those  of  Therefore  we  replace  Equation  (5.7)  by 

the  system 


(5.9) 


^1  _  ^1/2''  ^1  _  ^1/2'' 

Now  the  Euclidean  norms  of  g  =W  g  and  G  =W  G  should  coincide  with 


the  L  norms  of  g(v)  and  G((jj)  and  the  singular  values  of 
''1  _  ^1/2  ^^1/2 

F  =  (W  FW  )  be  close  to  those  of  P  . 

c  c 

We  have  calculated  the  L(e,c)  of  this  approximation  for  the  same 
values  of  £  and  c  used  above,  the  results  appear  in  Table  3.  It  appears 
that  Gaussian  quadrature  is  on  a  par  with  Galerkin  approximations  by  Legendre 
polynomials  as  an  efficient  discretization;  this  is  only  to  be  expected  given 
the  role  Legendre  polynomiaxs  play  in  Gaussian  quadrature.  However  in  general 
Galerkin  approximations  seem  preferable  since  they  ;'.jst  be  used  anyway  to 
choose  among  the  many  possible  discretizations  associated  with  a  given 
quadrature  rule,  and  to  perform  interpolation  within  the  interval  or  extra¬ 
polation  beyond  it. 

To  test  the  actual  inversion  of  noisy  data,  the  algorithm  of^ection  3, 

2 

with  basis  functions,  was  used  to  invert  the  following  data 

corrupted  with  random  noise: 


g^(v) 

sin  2Trv 

TTV 

,  |v|  <  1 

(5.10) 

92  (v) 

../sin  TTvV 
\  / 

2 

,  |v|  Cl 

(5.11) 

These  functions  can  be  interpreted  as  the  point  spread  functions  (=  diffraction 
images)  caused  by  an  aberration-free  slit  aperture  in  coherent  light  and  in¬ 
coherent  light  respectively.  The  true  solutions  to  the  inversion  problem  are: 


G2(u))  =  2(1  -  |u)|)  ,  |toj  <1 

=  0  It*)!  >  1  (5.13) 

The  G's  are  the  corresponding  optical  transfer  functions  in  the  diffraction 
imagery  interpretation.  In  both  cases  c  =  1.  The  calculated  inversions  were 
compared  to  the  true  G^  (a.)  on  ju-|  ^1,  and  the  approximate  extrapolations 
of  Equation  (3.26)  were  compared  to  the  true  g. (v)  on  |vl  <4. 

-2 

We  first  applied  the  algorithm  to  noiseless  data,  choosing  6  =  1,  e  =  10 
and  K(N)  =40  for  g  ,  K(N)  =80  for  g.  (in  both  cases  '' 

<  e)  .  The  resulting  approximations  to  G^  (w)  coincides  with  the  true  (oi) 
to  well  within  graphical  accuracy.  The  approximate  extrapolation  to  g^(v) 
also  coincided  to  the  true  g^(v)  to  within  graphical  accuracy.  The  approxi¬ 
mate  extrapolation  to  g2(v)  is  plotted  in  Figure  1,  the  solid  line  is  the 
true  g2 (v) . 

To  simulate  the  presence  of  noise  in  the  data,  whenever  a  value  of  g(v) 

was  required  in  the  numerical  integrations  used  to  calculate  the  components 
2 

Cj^  =  (g,Pj^)  »  a  random  perturbation  was  added.  That  is 

9p(v)  «  g(v)  + ey  max  |g(v)|  (5.14) 

v€c 

was  used  in  place  of  g(v).  The  constant  e  denotes  the  noise  level,  y  is 
a  r^uldom  variable  xiniformly  distributed  over  [-1,1].  For  this  artificial 
noise,  the  error  level  t  of  the  algorithm  was  taken  to  be  the  e  in  Equation 
(2.8).  Since  we  had  no  prior  expectations  on  the  accuracy  of  the  inversions, 
the  error  tolerance  6  of  the  algorithm  was  taken  to  be  unity. 
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To  test  the  ^obui-tne^6  of  the  algorithm,  reconstruction  of  data  with 
different  random  perturbations  of  different  noise  levels  were  made.  The  same 
K(N)  as  in  the  noiseless  case  were  used.  Table  4  lists  four  reconstructions 
of  (co)  from  data  with  1%  (i.e.,  e  =  .01)  noise  level;  there  is  approximately 

a  5%  maximum  deviation  from  the  true  value.  Figure  2  shows  four  reconstructions 
of  G^(uj)  from  data  with  a  3%  noise  level;  there  is  now  a  25%  maximum  deviation 
from  the  true  value.  Reconstructions  from  data  with  5%  noise  levels  were  still 
recognizable;  but  at  10%  noise  levels  the  error  (measured  in  the  energy  norm) 
in  the  reconstruction  exceeded  50%.  Figures  3  and  4  show  (see  open  circles) 
the  extrapolation  of  two  sample  reconstructions  from  Figure  1.  For  comparison 
the  true  g^(v)  is  also  shown  as  the  solid  line.  Because  G^_  and  g^  are 
even  functions  of  their  arguments,  the  graphs  have  been  plotted  only  for  the 
negative  values  of  the  respective  argxanents. 

Figure  5  and  6  show  four  realizations  of  (w)  from  data  with  3%  noise 
level.  Extrapolations  of  three  of  these  reconstructions  appear  in  Figures  7-9, 
with  the  true  92^^^  shown  as  the  solid  line.  Although  the  reconstructions  of 
(u)  appear  to  be  more  accurate  than  those  of  Gj^(w)  ,  the  extrapolation  to 
g2(v)  is  less  accurate  than  those  to  g^^iv).  This  is  due  to  the  discontinuity 
in  slope  of  G^ (w)  at  the  origin.  Since  the  approximations  to  G^ (w)  are 
sums  of  small  numbers  of  smooth  eigenfunctions,  they  do  not  approximate  G2(0) 
well  as  the  graphs  indicate.  This  low  frequency  error  in  the  oj  domain  is 
then  transformed  into  high  frequency  errors  in  the  v  domain,  resulting  in 
errors  in  the  extrapolation. 


The  analysis  used  in  defining  the  "trustworthy"  component  of  G,  and  a 
trustworthy  Galerkin  approximation,  is  tMOfiht  COie  in  nature.  It  is  therefore 
reasonable  to  supply  the  algorithm  with  smaller  e  or  larger  6  than  a 
pfiLofiL  noise  levels  suggest  (as  was  done  here)  or  use  a  different  filter  in 
Equation  (3.25).  However  this  cannot  be  made  precise  without  some 
statistical  statements  on  expected  errors. 
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TABLE  2:  Distribution  of  singular  values,  and  required  basis  size  for 

desired  accuracy  when  e  =  .002 


c 


N(e,c) 


L^{e,c) 


(e,c) 


L^  (e,c) 


TABLE  3. 


Distribution  of  L(e,c)  for  approximations  based  on  Gaussian 
quadrature . 


c 

L(.02,c) 

L{.002,c) 

.5 

4-8 

4-8 

1.0 

8-12 

8-12 

1.5 

14-18 

18-22 

2.0 


30-34 


30-34 


TABLE  4  ; 


Four  Sample  Realizations  of  Reconstructions 


with  1%  Noise. 


U) 

-1.0000 

-  .9474 

-  .8947 

-  .8421 

-  .7895 

-  .7368 

-  .6842 

-  .6316 

-  .5789 

-  .5263 

-  .4737 

-  .4211 

-  .3684 

-  .3158 

-  .2632 

-  .2105 

-  .1579 

-  .1053 

-  .0526 
0 


G(a)) 

G  (co) 

•  983 

.973 

1.  009 

1.010 

1.013 

1.021 

1.004 

1.017 

.991 

1.007 

.980 

•  997 

.974 

.990 

.975 

.989 

.980 

.993 

.988 

1.000 

.999 

1.008 

1.007 

1.015 

1.013 

1.019 

1.014 

1.019 

1.011 

1.015 

1.005 

1.008 

.997 

.999 

.988 

.990 

.982 

.983 

.978 

.979 

G(a)) 

G(a)) 

1.099 

.907 

1.043 

1.005 

.995 

1.046 

.962 

1.050 

.944 

1.035 

.942 

1.013 

.952 

.993 

.971 

,980 

.995 

.976 

1.017 

.979 

1.035 

.988 

1.046 

.998 

1.048 

1.007 

1.042 

1.012 

1.028 

1.014 

1.008 

1.012 

.987 

1.007 

.968 

1.001 

.953 

.996 

.945 

.993 

ABSTRACT 


Phase  retrieval  problems  are  ill-posed,  however  previous  analysis  has 
focused  upon  global  ill  conditioning  due  to  the  existence  of  multiple  exact 
solutions.  In  this  paper  we  consider  the  effects  of  local  ill  conditioning 
due  to  the  presence  of  large  infinite  dimensional  neighborhoods  of  any  exact 
solution  where  members  are  all  possible  solutions  provided  there  is  any  un¬ 
certainty  in  the  data.  The  form  of  such  neighborhoods  can  be  characterized  by 
viewing  phase  retrieval  as  a  nonlinear  extension  of  the  linear  problem  of 
inversion  of  the  finite  Fourier  transform  considered  in  the  previous  companion 
to  the  present  work.  In  particular,  we  are  able  to  estimate  the  essential 
dimension  of  phase  retrieval  problems,  i.e.  the  nimber  of  parameters  in  a 
solution  representation  that  can  be  determined  accurately  given  specified 
error  levels  in  the  data.  Based  on  these  results  a  modification  of  the  widely 
used  Gerschberg-Saxton  algorithm  is  proposed,  analyzed,  and  then  used  as  a 
basis  for  development  of  more  sophisticated  algorithms.  Numerical  results 
are  presented  on  the  performance  of  these  algorithms  on  one-dimensional 
problems.  The  results  indicate  that  although  the  algorithms  may  be  tuned  to 
overcome  local  ill  conditioning,  good  solutions  in  one  dimension  are  still 
difficult  to  find  numerically  because  of  global  ill  conditioning.  However, 
the  material  in  the  Appendix  indicates  that  in  higher  dimensions  global  ill 
conditioning  is  considerably  reduced  so  that  the  algorithms  should  be  effective 
in  such  problems. 


Ill 


1 .  INTRODUCTION 


In  our  companion  paper  [1] ,  we  introduced  the  problem  of  numerical  recon¬ 
struction  of  a  partially  known,  bandlimited  Fourier  transform  pair  g,  G  and 
considered  in  detail  the  model  linear  problem 

Given  values  over  a  finite  interval  ^  function  g(v) 

known  to  be  the  Fourier  transform  of  a  function  G(ciJ)  with  support 
contained  in  the  finite  interval  (bj^,b2],  find  G(a)). 

The  paper  focused  on  the  ill  posed  nature  of  this  problem  and  its  effects  on 
numerical  solution,  an  algorithm  (filtered  singular-value  decomposition)  was 
presented  that  could  be  specially  tailored  to  cope  with  these  effects.  In 
this  paper  we  consider  the  associated  model  nontine.(VL  problem  of  pk(l6e. 
-'let'u.evad. 

The  model  problem  in  phase  retrieval  is 

Given  value  m(v)  over  a  finite  interval  [aj^,a2]  of  the  modulus  of  a 
function  g (v)  known  to  be  the  Fourier  transform  of  a  function  G (w) 
with  support  contained  in  the  finite  interval 

knowledge  that  G  is  a  member  of  some  set  B,  find  g(v)  and  G(u)). 

The  set  B  represents  available  prior  knowledge  on  G.  In  this  paper  we 
study  three  particular  cases:  no  prior  knowledge,  G  is  nonnegative,  and 
]G(a>))  =n(co)  is  known  over  [b^,b2].  Again  our  focus  is  on  the  ill  posed 
nature  of  phase  retrieval  and  the  resulting  implications  for  numerical  re¬ 
constructions.  In  particular  we  generalize  the  iterated  projection  algorithm 
of  Gerschberg  and  Saxton  [2]  in  such  a  fashion  so  that  it  may  be  identified 
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as  a  first  order,  gradient  type  optimization  algorithm.  We  then  propose 
second  order,  Newton  type  extensions  of  the  algorithm  and  show  that  the  linear 
problem  arising  at  each  step  has  behaviour  dominated  by  the  behaviour  of  the 
linear  problem  [1] . 

The  structure  of  the  paper  is  as  follows.  Section  2  sets  out  the 
notation  employed  in  the  paper  and  lists  the  particular  model  problems  used 
in  illustrative  numerical  calculations.  Section  3  contains  a  discussion  of 
the  ill  conditioning  of  phase  retrieval,  focusing  on  the  two  causes: 
possible  multiple  solutions  and  ill  conditioning  of  the  underlying  linear 
operator  (the  finite  Fourier  transform) .  In  the  Appendix  we  show  that 
although  in  one  dimension  multiplicity  of  solutions  presents  a  serious 
problem,  in  two  or  more  dimensions  its  occurrence  is  pathologically  rare. 
Section  4  is  devoted  to  a  generalization  of  the  Gerschberg-Saxton  algorithm 
through  restatement  of  the  phase  retrieval  problem  as  one  of  finding  the 
closest  point  to  several,  possibly  disjoint,  sets.  Two  particular  algorithms 
are  developed  that  use  iterated  projections;  their  convergence  properties  are 
investigated  and  their  identification  with  the  well  known  steepest  descent 
algorithm  is  established.  In  Section  5  a  number  of  second  order  algorithms 
are  developed  from  those  of  Section  4.  Each  algorithm  requires  the  solution 
of  an  ill  posed  linear  problem  at  each  iteration;  we  propose  that  this 
solution  be  found  by  filtered  singular  value  decomposition.  In  order  to 
save  on  the  cost  of  an  expensive  decomposition  calculation  at  each  step,  we 
indicate  how  a  partial  filtering  can  be  obtained  using  a  block  decomposition 
of  the  linear  system  and  a  precalculated  singular  value  decomposition  of  the 


finite  Fourier  transform  (fFT)  that  is  constant  across  iterations.  Finally, 
Section  6  contains  numerical  results  showing  the  application  of  these  algo¬ 
rithms  to  each  of  the  model  phase  retrieval  problem.  Tables  showing  the 
relative  performance  of  the  various  algorithms  and  graphs  of  the  approximate 
solutions  generated  are  presented  and  discussed. 

The  results  indicate  that  in  one  dimension  the  multiplicity  of  solutions 
induces  poor  global  conditioning  so  that  the  significant  improvement  of 
second  order  methods  over  first  order  methods  is  restricted  to  small  local 
neighborhoods  of  the  solution.  Therefore  the  sin^jlest  algorithms  appear  to 
be  the  most  cost  effective,  however  we  feel  that  in  higher  dimensions,  with 
the  increased  likelihood  of  unique  solutions,  that  higher  order  methods  will 


come  into  their  own. 


2. 


SOME  MODEL  PROBLEMS  IN  PHASE  RETRIEVAL 


2 

The  analysis  of  this  paper  is  set  in  the  function  space  L  OR)  with  the 

standard  inner  product  (•»•)  and  norm  II  -  II .  The  Fourier  transform 
2  2 

:L  OR)  L  (R)  is  here  defined  to  be 


=/ 


g(v)  =  {^q){^)  =1  (iju)  du- 


Subsets  of  L  OR)  will  be  denoted  by  A.  or  B., 

1  S 

2 

P^:  L  OR)  ■‘■A  onto  a  pcUiCcCLLtdA.  set  A  is  defined 


(2.1) 


and  the  projection  operator 
by 


P  f  =  g  llf-gll  =  min  llf-hll  (2.2) 

hCA 

P^f  is  termed  the  projection  of  f  on  A,  projections  are  assumed  to  exist 
and  be  unique. 

The  interval  [-1,1]  will  be  denoted  by  1  and  [-c,c]  by  cl;  cl 

2 

will  also  be  used  to  denote  the  subset  of  L  OR)  of  all  functions  whose 
support  is  contained  in  [-c,c].  The  associated  projection  operator  is 
therefore 


(P  f)  (V)  =  f  (v)  , 

cl 

if 

v€  [-c,c] 

=  0 

if 

otherwise  . 

(2.3) 


It  will  be  abbreviated  to  P^  for  typographic  convenience. 

Three  model  problems,  variations  of  phase  retrieval  that  occur  in 
physical  problems,  will  be  used  as  numerical  illustrations  in  the  remaining 


sections.  They  are: 


1. 


Given  values  m(v)  over  the  interval  modulus  of  a 

function  g(v)  known  to  be  the  Fourier  transform  of  a  function  G((jj) 
whose  support  is  contained  in  the  interval  find  g(v)  and 

G(oj).  Some  relevant  references  are  [2,3]. 

II.  Given  values  m(v)  over  the  interval  modulus  of  a 

function  g(v)  known  to  be  the  Fourier  transform  of  a  real,  nonnegative 
function  G{oj)  whose  support  is  contained  in  the  interval  [b^.b^lf 
find  g(v)  and  G(tjj).  Some  relevant  references  are  [4.5]. 

III.  Given  values  m(v)  over  the  interval  modulus  of  a 

function  g(v)  known  to  be  the  Fourier  transform  of  a  function  G(uj) 
whose  support  is  contained  in  the  interval  whose  modulus 

n(w)  is  given  over  [b^.b^l*  find  g{v)  and  G(u>).  Some  relevant 
references  are  [6,7]. 

These  two  moduli  m(v)  and  n(a))  may  only  be  known  to  within  some  accuracy 
e,  that  is  meoAuAemeitti  m  and  n  are  available  such  that 

llm-mll  ,  lln-nll^e  .  (2.4) 

In  this  setting  all  problems  and  errors  are  invariant  under  translation 
and  scaling;  so,  as  in  the  previous  paper  [1] ,  the  sets  '  ^^i'^2^ 

can  be  transformed  into  cl  where 

c  =  I  [(a2- a^^)  (b^-b^)]^'^^  .  (2.5) 

Upon  defining  the  sets 
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A  =  {g: 1 (P^g) (v) I  =  (P^m) (v) } 

S  cl 

=  cl  n  {G:G  >0}. 

B^  S  cl  n  {G:  I  (P^G)  {(*))  1  =  (P^n)  (w)  } 
the  i-th  problem  (i  =  1,2,3)  cam  be  rewritten  as; 


Find  g,G  such  that  g  =  ^G,  g€A,  G6B. 


3.  THE  ILL-POSED  NATURE  OF  PHASE  RETRIEVAL 

In  the  previous  paper  we  indicated  that  the  phase  retrieval  problem  is, 
in  general,  ill-posed  in  that  it  fails  to  satisfy  Hadamard's  definition  of  a 
well-posed  problem  (the  de'finition  is  repeated  here  for  convenience)  . 

Ve.^^yuXion:  a  problem  is  well-posed  if  the  solution 

a)  exists 

b)  is  unique 

c)  depends  continuously  on  the  data. 

If  ciny  of  these  conditions  are  violated  it  is  ill-posed. 

We  now  consider  the  ill-posed  and  ill  conditioned  nature  of  the  phase 
retrieval  problem  in  more  detail  with  particular  regard  as  to  the  consequences 
for  numerical  solutions.  We  wish  to  show  that  it  is  a  failure  of  condition  c 
that  is  the  main  source  of  difficulty. 

To  illustrate  the  point  we  briefly  review  the  prototypical  linear 
problem  and  its  solution  as  described  in  the  previous  paper,  the  inversion  of 
a  compact  linear  operator  by  use  of  its  filtered  singular  value  decomposition 
(SVD)  .  Let  H  be  a  separable  Hilbert  space  and  be  a  compact 

linear  operator  with  an  SVd{4)^ ,0^ .  If  g  is  an  arbitrary  member  of  H 
with  the  expansion 

00  00 

g  =  52  '  52  (3.1) 

i=0  i=0 

then  the  equation  Mg  =  g  has  the  formal  solution 
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The  solution  G  is  in  H  if  and  only  if  7.  „  la’. 0.  |  <  “.  Because 

^1=0  'll' 

the  singular  values  0^  tend  to  zero  this  is  a  stronger  condition  on  the 
coefficients  a^  than  Eq.  (3.1),  so  a  solution  does  not  exist  for  every  right 
hand  side  g.  Thus  the  problem  is  ill-posed  with  respect  to  existence  of 
solutions.  The  solution  is  unique  if  and  only  if  C/C  has  a  non-trivial  null 
space,  which  is  true  if  and  only  if  0^>0  Vi.  The  solution  does  not  depend 
continuously  on  the  data.  If  a  small  perturbation  £  is  made  in  a  high 
frequency  component  a^  of  g  then  a  perturbation  (£/0^)  is  induced  in  G; 
this  can  be  made  as  large  as  desired  by  increasing  i.  Thus  for  any  given  £, 
6  and  solution  pair  g,  G  there  exists  an  infinite  dimensional  set  of 
solutions  g,  G  such  that 

JTG  =  g,  llg-gll<e,  1Ig-GI1>6  .  (3.3) 

It  is  violation  of  condition  c  that  maJces  numerical  inversion  of 
compact  operators  so  difficult.  In  problems  arising  from  physical  systems 
existence  of  a  solution  is  a  prerequisite  for  making  the  measurements; 
failure  of  the  siibsequent  mathematical  model  to  have  a  solution  is  usually 
due  to  measurement  errors  in  g  or  CfC.  Thus  non-existence  is  really  a 
consequence  of  discontinuous  dependence  and  unimportant  in  its  own  right. 
Likewise  the  existence  of  multiple  iXClCt  solutions  is  not  in  itself  a  problem. 
In  most  cases  the  nullspace  can  be  predetermined  theoretically  and  the 
operator  restricted  to  the  complement  of  this  space.  If  the  restricted 


operator  had  a  bounded  inverse,  numerical  inversion  would  present  no  further 
difficulties . 

Solution  by  filtered  SVD  directly  addresses  the  problem  of  discontinuous 
dependence  on  the  data  by  identifying  the  infinite  dimensional  subspace  S 
over  which  CK  produces  distinct  (but  very  small)  variations;  then  restricting 
the  approximate  solutions  to  the  complementary  subspace  S'*"  over  which  CfC  ^ 
is  uniformly  continuous.  This  decomposition,  effected  by  the  filter, 
corresponds  to  choosing  the  first  N  terms  in  a  generalized  Fourier  expansion 
of  the  solution  G;  as  such  it  has  a  wide  variety  of  uses  and  interpretations. 

In  the  nonlinear  problem  of  phase  retrieval  the  traditional  investiga¬ 
tions  of  ill  conditioning  and  ill  positioning  have  concentrated  on  the 
questions  of  existence  and  uniqueness  I  3  ] .  Necessary  and  sufficient 
conditions  that  m  eind  g  must  satisfy  for  existence  of  a  solution  G  are 
given  by  the  Paley-wiener  theorem.  For  one-dimensional  retrieval  problems  the 
form  and  existence  of  multiple  solutions  is  a  consequence  of  the  Hadamard 
factorization  theorem  (a  result  first  derived  by  Akutowicz  (  8  ,  9  J .  An 
extension  of  this  theorem  gives  necessary  conditions  for  existence  of  multiple 
solutions  in  two-dimensional  problems.  A  detailed  exposition  appears  in 
Appendix  A. 

Although  questions  of  existence  and  uniqueness  have  some  influence  on 
the  behavior  of  numerical  algorithms,  as  with  the  linear  problem  discontinuous 
dependence  on  the  data  dominates.  Existence  or  non-existence  of  solutions  is 
again  just  a  simple  consequence  of  discontinuous  dependence.  If  multiple 
solutions  exist  but  are  uniformly  separated,  their  existence  will  influence 


the  initial  iterations  of  any  algorithm  but  the  final  behavior  will  be  deter¬ 
mined  by  the  question  of  whether  the  problem  is  well  posed  in  a  neighborhood 
of  each  solution  or  not.  As  we  shall  show  the  phase  problem  is  never  well 
posed. 

However  existence  of  multiple  solutions  may  be  a  stronger  contributor 
to  ill  conditioning  than  indicated  above.  As  shown  in  the  first  section  of 
Appendix  A,  in  the  one-dimensional  problem  there  may  exist  an  infinite 
sequence  of  exact  solutions  with  a  limit  G^,  corresponding  to  a 

sequence  of  finite  products  of  Blaschke  factors  and  the  limiting  infinite 
product.  In  this  case  the  solutions  are  not  uniformly  separated  and  any 
algorithm  will  have  difficulties  in  the  neighborhood  of  G^.  Fortunately, 
for  the  reasons  outlined  in  Appendix  A,  in  two-dimensional  problems  multiple 
solutions  appear  to  be  rare  and  the  limiting  behavior  described  above  xonlikely 
in  the  extreme,  xinless  symmetry  considerations  reduce  the  problem  to  an 
essentially  one-dimensional  form. 

Proof  of  discontinuous  dependence  on  the  data  is  difficult  for  an 
arbitrary  nonlinear  equation  ^(G)  =g.  The  most  widely  used  criterion 
is  that  the  Frechet  derivative  D(G)  of  SB  at  the  point  G  be  a  compact 
linear  operator.  The  phase  retrieval  problem  may  be  formally  stated  as  that 
of  finding  a  solution  pair  to  the  equation 

5?(G)  =  [p^^P^gI  =  {G)  =  |P^gl  =  m  (3.4) 

where  the  nonlinear  operator  SB^  is  defined  by 


•2’  (g)  =  lg| 


(3.5) 


i 

I 

‘  so  that  SE  is  the  composition  («)  of  and  the  compact  linear 

operator  P  .  The  Frechet  derivative  D  (g)  of  SE  (g)  is  the  bounded 
c  c  1  i 

linear  operator  defined  by 

I  D^(g)h  =  Re(g*.h)/lg|  .  (3.6) 

The  Frechet  derivative  D(G)  of  i?(G)  is  the  operator  D,  (G)  <>  P  ;  since 
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this  is  the  composition  of  a  bounded  operator  with  a  compact  operator  it 
follows  that  D(G)  is  compact  and  the  phase  retrieval  ill-posed. 

Since  D(G)  varies  with  G  the  decomposition  by  filtering  of  the 
underlying  space  into  a  subspace  S(G),  over  which  D(G)  ife  slowly  varying, 
and  its  complement  S‘*‘(G)  ,  is  not  a  global  decomposition.  Instead  S(G)  and 
S'*‘(G)  vary  with  G  and  serve  as  tamgent  planes  in  the  definition  of  mani¬ 
folds  and  over  which  ^(G)  is  slowly  varying  or  i?  ^  is  uni¬ 
formly  continuous.  However  as  is  not  linear,  generation  of  an  approxi¬ 
mate  solution  G  does  not  convey  the  information  that  is  contained  in 

generation  of  an  approximate  solution  from  a  linear  subspace  S'*";  and  is  also 
a  much  harder  problem. 

The  usual  approach  taken  to  overcome  this  problem,  and  to  deal  with 
measurement  noise,  is  to  reduce  the  ill  posed  problem  to  a  well  posed  problem 
by  regularization;  that  is  a  parameter  X>0  and  functional  f^(G)  are 
chosen  and  the  regularized  solution  found  by  minimization  of 

II  i?{G)  -  yll  +  AQ(G)  (3.7) 

is  taken  as  an  approximation  to  G.  As  stated  the  parameter  X  and 
functional  do  not  appear  to  depend  on  measurement  noise  or  the  problem 
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form  in  any  obvious  way,  but  the  following  theorem  due  to  Tikhonov  [10] 
elucidates  their  relationship 

J:  Under  certain  mild  conditions  on  G  and  g,  then 

minimizes  II  (G)  -  gll  +Xf^(G) 

if  and  only  if 

a)  there  exists  ^  such  that 

G^  minimizes  II  (G)  -  gll  subject  to  f2(G)  ^5^ 

b)  there  exists  ^  ®  such  that 

G^  minimizes  fi(G)  siibject  to  II  i?  (G)  -gll^e^  . 

This  approach  is  explicit  or  implicit  in  memy  of  the  algorithms  presented  in 
the  literature  [ 11] .  ^ 

Solution  by  regularization  does  not  identify  an  approximating  subspace, 
or  even  a  submanifold,  such  as  that  found  by  filtering;  settling  rather  for 
the  (less  informative)  identification  of  sets  =  {G:  IIl(G)  -gll<e}  and 
Ag  = {G:  n(G)  <6}  that  contain  the  solution  G.  However  the  phase  problem 
does  have  a  natural  decomposition  of  the  solution  space  into  subspaces  rather 
than  submanifolds,  encouraging  solution  by  filtered  SVD  rather  than  by 
regularization.  This  follows  from  the  form  of  D(G) .  Upon  definition  of  the 
subspace 

T  =  {h:  arg  h  =  arg  g}  (3.8) 


it  is  easily  shown  that 


Consequently  D^(g)  has  a  well  defined  null  space  T  and  a  bounded  inverse 
on  the  complement  T.  Therefore  the  ill  conditioning  of  D(G)  is  essentially 
due  to  the  operator  studied  in  the  previous  paper.  This  suggests 

that  the  global  decomposition  of  H  into  linear  subspaces  S  and  used 

there  in  inversion  of  P^jTP^G  =  g  by  filtered  SVD  will  also  be  appropriate 
in  the  nonlinear  problem.  This  idea  is  further  developed  in  Section  6. 


4. 


ITERATIVE  METHODS 


The  formulation  of  phase  retrieval  as  the  search  for  a  common  inter¬ 
section  point,  given  in  Equation  (2.7),  suggests  use  of  successive  projection 
algorithms  for  its  solution.  The  simple  structure  of  the  projection  operators 
and  Pg  for  the  sets  A  and  appearing  in  Eq.  (2.7)  indicates  that 

this  class  of  algorithms  will  be  computationally  efficient.  Such  algorithms 
are  presented  for  the  general  problem  in  [12]  and  were  first  applied  to  phase 
retrieval  by  Gerschberg  and  Saxton  [  7,13].  However  these  algorithms  were 
originally  derived  eind  analyzed  under  conditions  such  as  existence  of  am  inter 
section  point,  so  they  should  not  be  applied  directly  to  the  phase  problem  due 
to  the  presence  of  ill  conditioning.  We  therefore  propose  and  discuss  modi¬ 
fications  more  suited  to  the  ill-posed  nature  of  this  problem. 

4.1  Thz  Ge.neAic  AlgofuOthm 

Let  B.  be  a  collection  of  M  sets  with  associated  projection  operators 
^  M 

P..  Since  ill  conditioning  implies  that  D  B.  can  be  empty,  we  attempt  to 

^  i=l  ^ 

find  an  x  that  is  closest  to  all  sets  B.  and  which  reduces  to  a  common 

1 

intersection  point  -tiJ  onz  .  Therefore  we  seek  to  minimize 

M 

F(x)  =  y*  llx-P.xil  (4.1) 

iTl 

The  simplest  iterative  algorithms  for  minimization  of  F(x)  can  be 
considered  as  pointwise  approximation.  For  an  arbitrary  point  x  the  points 
b.  =P.x  are  point  approximations  to  sets  B.  and  induce  an  approximation 


to  F (x) .  The  algorithm  then  constructs  a  sequence  x  ,  where  x  ,  is 

n  n+1 

determined  from  x  after  minimization  of  G(z,x  ). 

n  n 


Lemma  1:  G(z,x)  has  a  unique  minimum  at 
then  G  (z  +  X  (y  -  z)  ,x)  <  G  (z  ,x)  VX  €  (0,2). 


y  =  M 


“1  rM 


rM 

Z.i=l 


P.x 

1 


and 


if 


Paoo^: 


H  2  ” 


G(z,x)  =  53  lib.  II  -2  53  (b.,z)+M 


i=l 

M 


i=l 


M  h 

^  53  llb.ll^-2M  ||2  -j^l  •  llzll  +Mllzll^ 

i=l  ^  i=l 


The  quadratic  in  llzll  is  minimized  at  lIzH  =  llyll,  and  the  inequality  is 
strict  unless  z=y.  Moreover 

M  „  M 


l(z  +  X  (y-z)  ,x)  =  53  ^  (y  -  z,b.  -  z)  +  X  Mlly  -  zil 

i=l  ^  i=l  ^ 

=  53  -zll  -  (2-X)XMlIy-zll^ 

i=l  ^ 

M 


<  53  11^-  “^11  “  G(z,x) 

i=l  ^ 


since  z  y  and  X6  (0,2)  imply  that  X  (2-X)  lly-zll  >0. 


Pfiopoi'it^on  /:  The  sequence  {x  }  defined  by  the  iteration 


satisfies 


X  =  X  +  A  (y  -X  ),  X  £  (0,2) 

n+1  n  n  n  n  n 


F(x  )  <F{x  )  or  X  =  X 
n+1  n  n+1  n 


If  the  projection  operators  are  single  valued  then  x  ,  =  x  implies  that 

n+1  n 

X  =x  Vk.  If  the  sequence  x  has  a  limit  point  x  at  which  the  pro- 
n  ■  jc  n  n 

jections  are  continuous  and 

0  <  lim  inf  X  <  lim  sup  X  <2 
n  n 


then  X  is  a  fixed  point  of  the  iteration. 
PfLOO^: 


^<Vi' 


Ell  X  ^  -  P  .  X  ,11 
1=1  ^ 


^  E  -P'X  II 

^  n+1  1  n 


=  G(x  +  X  (y  -z  )  ,x  ) 
n  n  n  n  n 

<  G(x  ,x  )  =  F(x  )  . 

n  n  n 

From  Lemma  1  the  last  inequality  is  strict  if  Therefore 

F(x  , )  =F(x  )  if  and  only  if  y  =  x  if  and  only  if  x  =x  implies 

n+1  n  ■'  ■'n  n  ^  n+1  n 

consequently  x  be  a  limit  point  of 

x^,  if  X  is  not  a  fixed  point  then  lly  -  xll  =e>0.  By  continuity  of  II  •  I 


p. 

1 

at 

X ,  then 

a) 

lim 

b) 

lim 

ynv  =  y 

(4.8) 

(4.9)  ' 


(4.10) 


These  inequalities  imply 

F(x^^l)  <  G(Xj^l,x^)  =  G(x^,x^)  -X(2-Xj^)Mlly^-x^I|2 

<  F(x„)  -  \  a^Me  (4.11) 

N  Z 

^  <  f(5) 

But  as  F(x  )  is  always  decreasing,  this  implies  the  contradiction 
n 

lim  F(x„  )  <  F(x) . 

Existence  of  limit  points  and  convergence  of  the  algorithm  from  an 
arbitrary  x^  cannot  be  guaranteed  without  further  rather  restrictive  global 
conditions  such  as  compactness  of  the  sets  and  continuity  of  the  pro¬ 

jections  P^.  However  examples  showing  failure  of  convergence  after  violation 
of  these  conditions  are  somewhat  pathological;  in  practice,  although  the 
conditions  may  not  be  met,  the  algorithm  almost  always  converges.  Apparent 
failures  are  usually  traced  to  ill  conditioning  not  to  violation  or  near 
violations  of  these  conditions,  therefore  we  assume  henceforth  the  existence 
of  limit  points,  leaving  determination  of  sufficient  conditions  for  this 
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existence  to  the  analyst,  and  turn  instead  to  an  alternative  characterization 
of  these  limit  points  and  the  algorithm  itself. 

4.2  Succe^-itve  P-tojecttoni  oi  Stezpe^t  VoAaznt  AtgonAXhim, 

To  demonstrate  that  the  generic  algorithm  of  Proposition  1  is  the 
standard  steepest  descent  algorithm  applied  to  F(x)  we  need  the  following 
definition:  Let  f:H-^]R  be  a  real  valued  function  on  a  Hilbert  space  H. 
Then  the  gradient  Vf(x)  €  H  of  f  at  x  is  y  if  and  only  if  y  is  the 
unique  vector  satisfying 

lim 

HzIKb 

2 

The  gradient  Vf(x)  is  the  sum  of  the  gradients  of  llx-P^xll  .  We  next 
require 

Pfiop06yLtLon  2.  if  there  exists  a  X  >  1  such  that 

P, (P,x  + X (x-P,x) )  =  P,x  (4.13) 

A  A  A  A 

and  P^x  is  single  valued,  then 

V(llx-P  xll^)  =  2(x-P,x)  .  (4.14) 

A  A 

PAOO^!  It  suffices  to  show  that  for  scsne  constant  c  the  following 
inequality  holds 

I  lly  “  -  2  (x  -  P  x,y  -  x)  I  ^  cllx  -  yll^  .  (4. 15) 

n  n 

We  first  note  the  pair  of  inequalities. 
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Ily-P,yll  <lly-P,xll  =  lly  -  P,yll+2(y  -  P,y,P  y  -  P,x)  +  llP,x  -  P,yll 
A  A  A  AA  A  A  A 


2  2  2  2 
lly  -  P  xli  ^  lly  -  P  y  II  =  lly  -  P  x!l  +  2  (y  -  P  x  ,P  x  -  P  y )  +  IIP  x  -  P  y  II 
A  A  A  AAA  AA 


and  the  resultant  inequa^.ities 


2  (y  -  P.y  ,P,y  -  P,x)  >  -  llp,y  -  p^xir 
A  A  A  A  A 


2{y  -  P,x,P,x  -  P,y)  <-  llPy-P^xT 
A  A  A  A  A 


We  also  require  the  equality 


2  2 

lly  -  P^yll  ~  llx  -  P^xll  -  2(y  -  x,x  -  P^x) 


2  2 

=  l!x-yll  +IIP,x-P  yll  +  2  (y  -  x,P,x  -  P,y)  +  2  (x  -  P,x,P^x  -  P,y) 
A  A  A  A  AAA 


We  next  bound  llP^x-P^yll  in  terms  of  llx-yll.  By  the  hypothesis  of  the 


proposition 


IIp.x  +  X  (x  -  P,x)  -  P,yll^  >  llp,x  +  X  (x  -  P  x)  -P,xll^ 
A  AAA  A  A 


=  X^llx-P  xll^ 
A 


which  in  turn  implies 


llP,x-Pyll  ^2X(P,x  -  x,P,x  -  P  y) 
A  A  A  A  A 


Now  use  of  Equations  (4.18)  and  (4.21)  gives 


IIP,x-P,yll  <  2  (P.x  -  y  ,P,x  -  P,y) 
A  A  AAA 


=  2 (P^x  -  x,P^x  -  P^y)  +  2(x  -  y,P^x  -  P^y) 


<  Y  llP,x-P  yll^  +  2llx-yllllP,x-P,yll 
A  A  A  A  A 


Consequently 


(4.23) 


IIP  x-P  yll  <  llx-yll 
A  A  A-1 

which  is  the  sought-for  result.  To  establish  the  lower  bound  on  the  inequality 
in  Eq.  (4.15),  reverse  x  and  y  in  Eqs.  (4.1b)  and  (4.17),  substitute  the 
new  Eq.  (4.17)  into  Eq.  (4.19)  and  use  Eq.  (4.23)  to  give 

lly  -  P  yll^  -  llx  -  P.xll^  -  2  (y  -  x,x  -  P^x)  >-  llx-yll^  .  (4.24) 

A  A  A  A-1 

For  the  upper  bound,  we  note  that  the  left  hand  side  of  Eq.  (4.15)  can  be 
written  as 

« 

-  lly  -  xll  ^  -  lip  y  -  P.xll  ^  -  2  (x  -  y  ,P  y  -  P^x)  -  2  (y  -  P  y  ,P  y  -  P^x) 

A  A  A  A  AAA 

-  2 (x  -  y,y  -  X  -  (P^y-P^x) )  (4,25) 

upon  using  Eq.  (4,19)  with  x  and  y  reversed.  Further  substitution  of 
Eqs.  (4.17)  and  (4.23)  yields 

lly  _  P  yll^  -  llx  -  P-xll^  -  2  (y  -  x,x  -  P.x) 

A  A  A 

^-lly  -  xll^  -  lip  y  -  P-xll^  +  2llx  -  yll  llp  y  -  P,xll  (4.26) 

A  A  A  A 

+  llP.y  -  Pj.xll^  +  2  llx  -  yll  [lly  -  X II  +  IIP  y  -  P.xll  ]  ^  llx-yll^ 

A  A  A  A  A*i 

which  completes  the  proof. 

This  result  is  sufficient  to  show  that 

M 

VF(x)  ~  ^  (x-P.x)  =  2M(x-y)  .  (4.27) 

i=l  ^ 

Hence  the  directions  y  -  x  used  in  the  algorithm  of  Proposition  1  are 

n  n 

steepest  descent  directions  for  F(x).  Furthermore  the  result  that,  at  a  limit 
point  x,y  =  x  shows  that  VF(x)=0  so  x  is  by  definition  a  stationary  point  of  F(x). 
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4.3  Tke.  R&i>VUcte.d  Projection  Atgorithm 

In  many  reconstruction  problems  considerations  such  as  computational 

complexity  and  firm  restrictions  on  the  functions  g  and  G  argue  for  the 

iterates  being  restricted  to  one  particular  set,  say.  We  now  show  that 

the  natural  modification  of  the  generic  algorithm  still  produces  a  sequence 

with  decreasing  function  values  whose  limit  points  are  stationary  points  of 

F(x),  with  X  restricted  to  B.,. 

M 

PropO^iXton  3.  if  the  projections  are  continuous  and  unique  then  the 


sequence  x^  €  B^  defined  by 


,  M-l 


■n  (M 


n+1  M  n  n  n  n  n 


(4.28) 


satisfies  F(x  ,)  <F(x  )  or  x  .  =x  for  all  k.  Furthermore  if  x  is  a 
n+1  n  n+k  n 

limit  point  of  x  ,  then  Vf(x)  is  normal  to  the  set  B„. 

n  M 


P/loOjJ; 


The  first  step  is  to  establish  a  bound  on  (x  ,  -  x  ,  y  -  x  ) 

n+1  n  n  n 


2  =  x+X(y-x) 
n  n  n  n  n 


(4.29) 


II X  ,  -  z  11^  =  II X  ,  -  X  11^  +  llx  -  z  11^  +  2  (x  -  X  ,x  -  z  ) 
n+1  n  n+1  n  n  n  n+1  n  n  n 


(4.30) 


2(x  -  X  ,x  -  z  )  =  2X  (x  -  X  ,y  -  x  ) 

n+1  n  n  n  n  n+1  n  n  n 


=  llx  ,  “  X  11^  +  llx  -  z  11^  -  llx  -  -  z  11^  .  (4.31) 

n+1  n  n  n  n+1  n 
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we  have  that 


Since  Pz  =  x 

M  n  n+1 


Therefore 


2  (x  -  X  ,y  -  X  )  ^  T —  llx  -  X  II 
n+1  n  -^n  n  A  n+1  n 

n 


(4.32) 


M-1 

F(x  ^  IIp.x  -  X  1 
n+1  in  n+1 


i=l 


M-1  M-1 

=  52  IIp.x  -  X  In  +  2  52  (P.x  -  X  ,x  -  X  ^, )  +  (M-1)  llx  . ,  -  x  I 
in  n  "  in  n  n  n+1  n+1  n 

1=1  1=1 


=  F(x  )  +  (M-1) (x  -  X  ,2y  -  x  -  x  ) 
n  n  n+1  n  n  n+1 


=  F(x^)  +  (M-l)[2(x^-x^^^,y^-xJ  +  llx^-x^^^ll  ] 


where  the  last  line  follows  from  Eq.  (4. 32) ;  consequently 


(4.33) 


F(x  <  F(x^) 
n+1  n 


(4.34) 


since  A  <1.  Furthermore  the  inequality  is  strict  unless  x  .  =x  .  Unique- 
n  n+1  n 

ness  of  projections  ensures  that  ^u+1  ~  ^n  that  *n+lc  ~  ^n  ^ 

be  a  limit  point  of  x^ ,  then  continuity  of  implies  that  ^  6  and 

M-1 

VF(x)  =  ^  2(x-P.x)  =  2(M-l)(x-y)  .  (4.35) 

i=l  ^ 

Since  x  is  a  fixed  point  of  the  iteration  (the  same  arguments  as  used  in 

the  proof  of  Proposition  1  shown  this)  then  ^  ^®  sphere 

S  =  {z:  llz  -  yll  <  llx  -  yll}  must  satisfy  snB„  =  (f.  But  the  sphere  has  a  tangent 

plane  at  x,  so  B..  also  has  a  tanaent  plane  there;  since  y -""x  is  the  normal 
M 

to  this  tangent  plane  then  y  -  x  = VF (x)/2 (M-1)  is  normal  to  B  . 


If  the  set  B  is  convex,  then  the  condition  A  €  (0,1)  can  be  replaced  by 
M  n 

A  €  (0,2).  If  B„  is  a  linear  siibspace  then  P.,  is  a  linear  map,  so  for  any 
n  M  M 

A 


P  (x  +  A  (y  -  X  )  )  = 
M  n  n  n 


M  n  M  n  n  M 


(4.36) 


Thus  a  unique  search  direction  P..(y  ~x  )  is  defined  regardless  of  A.  A 

M  n  n 

line  search  can  be  used  to  find  the  minimum  of  F  (P.,x  +  AP.,  (y  -  x  ) )  as  a 

M  n  M  n  n 

function  of  A. 


4.4  Rz6tAA.cX.Q.d.  Pfioje.cX^ovu>  -in  Phase  ReXAievaZ 

The  restricted  projection  algorithm  is  especially  suitable  for  the  phase 

retrieval  problem  for  two  reasons.  The  first  reason  stems  from  the  requirement 

that  G (w)  €  cl ;  this  set  is  a  linear  subspace  and  so  is  a  natural  choice  for 

the  set  of  the  previous  subsection.  The  second  reason  is  that  such  a 

choice  is  efficient.  The  problem  as  stated  gives  information  on  g  and  G 

only  over  the  sets  cl,  we  will  show  that  the  restricted  projection  algorithm 

with  B„  =  cl  requires  knowledge  of  g  and  G  at  each  iteration  only  over 
n  n  n 

cl,  even  if  the  solution  g(v)  is  required  over  the  entire  real  line.  With 
the  restricted  projection  algorithm,  only  the  values  used  until  the 

iterates  converge.  At  this  point  P^g  can  be  extrapolated  to  the  whole  line. 
The  existence  of  efficient  algorithms  for  phase  retrieval  has  not  always  been 
realized,  several  authors,  (14],  have  proposed  schemes  requiring  storage  of 
values  of  the  iterates  from  oatsZde  the  interval  cl. 

To  show  the  efficiency  of  the  algorithm,  we  present  the  details  of  the 
restricted  projections  arising  in  model  problems  I-III.  Upon  defining  the  set 
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^  A  =  {G:  Q,  =  3  g,  g  €  a}  ,  it  follows  that  g£A  if  and  only  if 


_  1  _  _  1  2 
S'  g  =  G€^  A.  Since  the  Fourier  transform  preserves  L  norms,  P 


is  the  closest  point  in  ^  to  G  if  and  only  if  jTp  G  is  the 

S~  k 

closest  point  in  A  to  g,  therefore 


-1 V 

s  k  * 


P  g  is  easily  shown  to  be 
A 


(P^g) (v)  =  m(v)e 


i  arg  g  (v) 


=  g(v) 


V  e  cl 
otherwise 


Therefore,  after  noting  that  P  P  =P  P  P  and  restricting  G  to  cl 

C  A  C  A  C 


p  g  =  g  -  p  g  +  p  p  p  g 
A  C  C  A  C 


and 


=  G-P  S^~\  S'P  G  +  P  S^~^p  P,P  G 
c  c  c  c  c  A  c  c 


As  the  remaining  sets  are  by  definition  contained  in  cl ,  it 


follows  that  P  =  P„  •  For  the  record  these  projections  are 
c  B^ 


P„  G  =  P  G 
®1 


(Pp  G)  (uj)  =  (Re  G)  (OJ)  ,  if  (Re  G)  (w)  >  0,  w  €  cl 
=  0  ,  otherwise 


(PB3G)  (u) 


n(cu)e^  ,  if  u  G  cl 


0 


.  otherwise 


For  typographic  convenience  we  will  let  P  ,  ->P,, 

r” 1«  1  2 


Pp  ^Iso  we  will  denote  the  restricted  projection  algorithm  using  these 

3 


projections  and  the  set  = cl  by  RP.  Therefore  the  iterates  x  become 

M  n 


iterates  G  €cl.  Further  we  shall  denote  the  search  directions  y  -x  by 
n  -'n  n 


For  RP,  is  given  by 


1  ” 

=  -  VCPP-G-G) 

n  M  4^  c  1  n  n 
1=1 


(4.44) 


The  calculation  of  H  requires  values  of  g  and  G  only  over  cl  after 

n  n  n 


use  of  Eq.  (4.40)  for  P  P, . 

cl 


4.5  IZZ  CondiXiorUng  and  Linz  SzxVLchzi 

The  geometric  view  of  the  RP  algorithm  developed  in  this  section  sheds 

new  light  on  some  of  the  problems  encountered  in  previous  use  of  projection 

algorithms,  [  7 ] ,  in  particular  that  of  constant  but  small  decrements  in 

F(G^).  This  may  be  due  to  local  ill  conditioning  of  the  problem;  with  the 

geometric  interpretation  that  the  sets  emd  are  nearly  parallel  at 

either  intersecting  at  a  very  acute  angle  or  failing  to  intersect  at  all. 

Figure  1  shows  simple  two-dimensional  examples  of  such  problems .  In  both 

cases  shown  in  Fig.  1,  H  is  very  small  so  that  if  X  €  (0,2)  then  G  and 

G^^^  are  nearly  coincident  even  though  G^  is  far  from  the  true  minimum. 

The  relation  established  between  projection  methods  and  steepest 

descent  methods  opens  up  a  wide  range  of  algorithms  already  developed  for 

gradient  methods.  In  particular,  we  note  that  calculation  of  F(G  ) 

n 

requires  evaluation  of  so  we  may  simultaneously  calculate 


because 


Hn  =  P^H^.  (G,P^H)  =  (P^G,H)  .  (4. 

Therefore  a  quadratic  [15]  or  cubic  [16]  line  search  routine  will  improve 
convergence  beyond  simple  variation  of  A  in  (0,2). 


5. 


SECOND  ORDER  ITERATIVE  METHODS 


The  iterative  algorithms  for  minimization  of  F(G)  discussed  in  the 

previous  section  were  developed  from  first  order  truncations,  either  affroxi- 

mating  F(G)  by  its  gradient  Vf(G)  or  taking  pointwise  approximations  to 

the  sets  A  and  B, .  In  a  similar  vein  we  now  construct  second  order  algo- 

1 

rithms  by  approximations  involving  the  Hessian  of  F(G)  and  affine 

approximations  to  the  sets.  A  succession  of  such  algorithms  is  constructed, 
increasing  in  accuracy  and  in  complexity  up  to  the  standard  Newton's  method 
for  minimization  of  F (G) .  In  each  algorithm  a  linear  subproblem  must  be 
solved  at  each  stage;  typically  the  subproblem  is  ill-conditioned  and 
requires  a  filtered  inversion.  The  geometric  viewpoint  shows  that  the  system 
can  be  taken  to  be  a  linear  least  squares  problem  whose  solution  is  the 
closest  point  to  a  collection  of  affine  subspaces  approximating  the  sets  X 
and  B^.  We  then  demonstrate  that  in  this  new  form  the  system  may  be  pre¬ 
conditioned  so  that  a  filtered  inversion  is  possible  without  having  to 
calculate  an  expensive  singular  value  decomposition  at  each  stage. 

Unlike  pointwise  approximations,  the  affine  approximations  to  the  sets 
are  not  necessarily  contained  within  the  sets,  therefore  we  cannot  guarantee 

that  F(G  +Ah  )  <F(G  )  for  A€  (0,2).  However,  it  is  often  possible  to 
n  n  n 

show  that  the  search  directions  H^  generated  at  each  iteration  are  descent 

directions,  i.e.  (d/dX)F(G  + Ah  ),  _  < 0.  In  the  interests  of  brevity  only  the 

n  n  A=o 

form  of  the  algorithms  is  presented  here,  proof  of  these  and  other  similar 
results  are  left  to  the  reader. 


138 


5.1  The  Paitcal  A^^ine  Aigo'Uthrn 


As  an  introduction  to  the  use  of  affine  approximations  we  present  a 
simple  extension  of  RP.  In  RP  at  the  n-th  iteration  the  set  A  is 
approximated  by  the  point  so  that  the  restricted  projection 

is  given  by  Eq.  (4.40).  However  membership  of  g  in  A  is  determined  only 
by  the  modulus  values  of  g  across  cl;  outside  of  this  interval  there  are 
no  restrictions.  Therefore  the  affine  subspace 


A  E  {h:h  =  P  P^g  +  (g-P  g) ,  g  €  L  OR) } 
n  c  An  c 


(5.1) 


contains  the  point  ^^9^^  is  contained  in  A.  So  for  any  function  G  €  cl 


IIg  -  P. G  II  <  lb  -  P  G  II  =  lb  g 

n 


(5.2) 


n 


Use  of  A  and  the  points  P.G  as  approximations  to  sets  gives  an 
n  in 


approximation  to  F  such  that 


M 


F  (G  =  IIP  P,g  -  P  Glr  +  T]  llp.G  -Gli 
n  c  A’n  c  c  "  i  n 

1=2 


(5.3) 


F(G)  <  F  (G) ,  F(G  )  =  F  (G  ) 

n  n  n  n 


F  (G)  is  minimized  at  the  point  L  satisfying  the  normal  equations 
n  n 


(P  ^  +  (M-1)I)L 

c  c  c  n 


P^  ^P  P^g 
c  c  A  n 


M 

E 

i=2 


P.G 
1  n 


(5.4) 


giving  a  search  direction  H=L-G.  If  M>1,  Eq.  (5.4)  can  be  solved 

n  n  n 

any  regular  linear  equations  pacltage  since  the  eigenvalues  of 


by 


(P  JF  P^  +  (M-l)I)  are  all  greater  than  unity  and  less  than  M.  If  M  =  l, 
c  c  c 

then  minimization  of  reduces  to  the  least  squares  problem 

min  lip  G-P  P,g  11^  (5.5) 

C£cl  =  ' 

which  is  the  focus  of  the  previous  paper.  In  this  case  the  ill  conditioning 
of  the  finite  Fourier  transform  P ^  forces  the  use  of  a  filtered  approxi¬ 
mation  L  to  L  in  which  L  is  the  projection  of  L  onto  the  span  of 
the  first  few  eigenfunctions  of  P ^ Since  the  same  linear  operator 
appears  at  each  iteration,  the  eigendecomposition  need  be  calculated  only 
once.  As  the  range  of  the  projection  remains  unchanged,  an  extra  global 
restriction  on  the  iterates  to  this  span  is  inplicitly  imposed. 

Henceforth  the  algorithm  with  search  direction  given  by  Eq.  (5.4)  will 
be  termed  the  PA  algorithm. 


5.2  The  Gaa66~Nmton  KLQOfUXhm 

Having  introduced  the  geometric  viewpoint  of  algorithms  as  based  upon  set 
approximations,  we  new  look  at  some  of  the  standard  nonlinear  least  squares 
algorithms  in  this  context.  For  solution  of  the  model  problem 


min  F(x)  =  min 


x€lR 


,N 


x€ir' 


N 


M 

i=l 


(5.6) 


a  popular  choice  of  iterative  algorithm  is  the  Gauss-Newton  method  [17] 
in  which  at  the  n-th  iteration  the  search  direction  is  given  by  the 


least  squares  minimum  norm  solution  to  the  linear  system 


(5.7) 


J(x  )2 
n  n 


•f(x  ) 
n 


-M 


where  f(x  )  is  the  vector  in  IR  with  components  f. (x  )  and  J (x  )  is 
■  n  inn 


the  M  X  N  matrix  with  entries 


J .  .  (x  )  =  3 —  f .  (x  ) 
IT  n  dx .  1  n 

: 


(5.8) 


In  the  present  paper  x  =  GeL  OR)  and  f^(x)  =llx-P^xll;  the  gradient  of 


(x)  as  defined  by  Eq.  (4.12)  is 


(Vf^)  (x)  =  (x  -  P^x) /llx  -  P^xll 


(5.9) 


2  M 

so  that  the  Jacobian  ^  is  now  an  operator  from  L  OR)  into  ]R  defined  by 


[^(x)z]^  =  (x  -  P^x,z) /llx  -  P^xll 


(5.10) 


Therefore  z  is  now  the  least  squares,  mirtimum  norm  solution  to  the  underde- 
n 


termined  system,  Eq.  (5.7),  with  replacing  J (x^) ;  such  a  z^  gives  a 

descent  direction  for  F(x). 

We  next  elucidate  the  affine  approximation  implicit  in  this  algorithm. 


Since  P.x  is  the  unique  closest  point  to  x  from  B. ,  it  follows  that  if 
in  n  1 


S^(x^)  is  the  sphere 


S.(x  )  =  {z:llx  -  zll  <  llx  -P.x  11} 
in  n  n  1  n 


(5.11) 


then  B.ns.(x)=P.x,  At  P.x  the  sphere  has  a  well  defined  tangent  plane 
iin  in  in 


T.(x)  ={p.x  +z:(x  -P.x  ,z)  =  ((Vf.)(x),z)=0}  . 


(5.12) 


This  must  also  be  locally  a  supporting  hyperplane  for  at  2 

shows  a  simple  two  dimensional  example.  T^(x^)  can  be  taken  as  an  affine 

approximation  to  at  the  n-th  iteration,  in  which  case  the  least  squares, 

M 

minimum  norm  solution  to  Eq.  (5.4)  is  the  closest  point  in 

x^.  Since  codimension  one,  this  intersection  is  always  nonempty, 

being  in  fact  an  infinite  dimensional  linear  subspace. 

Although  we  bring  this  algorithm  to  the  readers  attention  we  did  not 
implement  it  for  two  reasons.  The  first  is  that  the  subspaces  (x^)  are 
poor  approximations  to  the  sets  B^ ,  containing  no  further  information  than 
that  already  available  in  the  projections  Second  for  our  model 

problems  M<3,  so  that  Eq.  (5.7)  is  of  very  small  dimensions.  Therefore  the 
search  directions  are  not  likely  to  be  very  di^erent  from  those  of  RP 

or  PA. 

5.3  Newton' 6  AlgofUthm 

The  standard  second  order  Newton's  algorithm  for  Eq.  (5.6)  is  based  on 
the  approximation  of  F(z)  near  ^  by  a  truncated  Taylor  series 

expansion 

F(x)  «  F(x  )  +VF(x  )  (x-x  )  +  ^  (x-x  )V^F(x  )  (x-x  )  .  (5.13) 

n  n  n  2  n  n  n 

2 

If  V  F(Xj^)  is  a  positive  definite  matrix,  then  the  approximation  is  minimized 

at  the  point  y  =  z  +  x  ,  where  z  is  the  solution  of 
n  n  n  n 

V^F(x  )z  =  -VF(x  )  .  (5.14) 

n  n 
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In  phase  retrieval  problems  this  linear  system  is  ill  conditioned  and  the 
Hessian  often  has  negative  eigenvalues.  Therefore  to  ensure  that  a  descent 
direction  is  chosen  for  which  Eq.  (5.13)  is  a  good  approximation,  an  eigende- 
composition  of  the  Hessian  must  be  calculated  and  a  filtered  solution 
used.  This  approach  must  be  taken  if  Eq.  (5.14)  is  solved  directly.  However, 
as  we  will  show,  a  geometric  interpretation  allows  the  equation  to  be  rewritten 
in  a  form  in  which  filtering  can  be  performed  without  the  cost  of  an  eigende- 
composition  at  each  iteration. 

2 

We  begin  by  noting  that  the  quantities  Vf(x^)  and  V  F(x^)  are  given 


by 

M 

VF(x  )  =  2  (X  -  P.x  )  (5.15) 

n  n  in 

1=1 


V^F(x  ) 
n 


M 

2  (X  )) 

i=l  ^ 


(5.16) 


2 

where  the  operators  Jf.  (x  )  are  the  Hessians  of  the  functions  f.  (x)  =  llx-P.xll 

in  11 

evaluated  at  x^.  They  are  defined  by 


jfe.  (x  )  :L^aR) 
1  n 


-»-L^aR) 


{x  )y  =  lim 
^  "  a-K) 


P ,  (x  +ay)  -  P .  (x  ) 
in  in 

a 


(5.17) 


Conditions  on  the  set  that  guarantee  existence  and  boundedness  of  the 

operator  are  quite  complicated  [  18  ] .  For  the  purposes  of  this  paper  we 

shall  assume  that  <x^)  exists.  The  only  further  properties  we  require  are 


that  is  symmetric  and  that 


0 


(5.18) 


Jif!  (x  )  (x  -  P .  X  )  = 
in  n  in 


The  Hessian  implicitly  defines  a  first  order  approximation  to  the  pro¬ 
jection  operator  at  x^  by 

P.xRsP.x  (x  )  (x-x  )  (5.19) 

1  1  n  1  n  n 


which  in  turn  defines  an  associated  affine  approximation 

B.  at  P. (x  ) 

1  in 

U.(x)  ={p.x  +z:z€  Range  (x  ) }  .  (5.20) 

in  in  in 


U^(x^)  has  more  structure  than  the  previous  approximation  because 

(x^)  is  not  just  a  projection  operator  with  reuige  ~  ^i^n 

also  a  contraction  (or  expansion)  mapping  about  depending  on  whether 

is  locally  convex  (or  concave)  in  particular  directions  at  Figure  3 

shows  a  two-dimensional  example  of  U^tx^)  for  a  sphere;  (x)  is 

obviously  a  contraction  mapping.  Therefore  at  the  n-th  iteration  we  choose  a 
search  direction  z  =  y  -  x  where  y  minimizes  the  quadratic  approximation 


M 

F(y)  w  T)  Ily-P.x  -Jgr.  (X  )  (y-x  )  II  (5.21) 

1  n  1  n  n 

1=1 


to  F(x).  The  geometric  view  of  y  is  that  it  is  the  "closest  point"  to  the 

n 

collection  of  subspaces  with  the  distance  measured,  not  in  the 

2 

standard  L  metric,  but  as  a  sum  of  new  metrics.  Each  new  metric  reflects 
the  degree  of  curvature  of  the  set  at  therefore  the  accuracy 

of  the  affine  approximation  to  at 


which  in  turn  is  recognizable  as  the  normal  equations  associated  with  the 
block  system 

(I  -ae^  (X  -(X  -  P,x  ) 

In  n  1  n 

:  z  =  :  .  (5.26) 

1/2 

ii-je’ix))  ^  -(x^-p  X  ) 

M  n  n  M  n 

Equation  (5.26)  has  a  geometric  interpretation,  its  solution  is  the  solution 
to  an  approximate  minimization  of  F(x)  similar  to  that  of  Eq.  (5.21)  using 
the  same  affine  approximation  ®i  different  metrics. 

5.4  lit  Conditioning  and  FitteAing 

We  noted  at  the  end  of  Section  4  that  line  searches  could  overcome  some 
simple  cases  of  ill  conditioning.  We  now  consider  more  complicated  cases  and 
ways  to  alleviate  them. 

2 

The  first  comes  from  the  fact  that  L  (cl)  is  not  one-dimensional,  as 
portrayed  in  Fig.  1,  so  that  the  valleys  of  the  surface  T  = { (F (G) ,G) :G £  cl} 
will  not  typically  be  straight  but  rather  will  be  curving  through  space  in 
much  the  same  fashion  as  the  classical  test  case  for  optimization  algorithms, 
the  Rosenbrock  function  [  16  ] .  Experience  has  shown  that  second  order 
methods  significantly  outperform  gradient  methods  in  such  exaiiples,  but  that 
such  features  can  easily  be  sufficiently  ill  conditioned  to  defeat  even  second 
order  algorithms. 

The  second  arises  from  the  existence  of  the  sxjbspace  S  described  in 
Section  3  which  is,  for  practical  purposes,  the  null  space  of  ^ 
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significant  component  of  the  search  direction  lies  within  this  subspace  then 


a  relatively  large  step  can  be  taken  for  a  very  small  decrement  in  F(G). 

This  may  lead  to  "zig-zags"  in  which  steps  are  taken  backwards  and  forwards 

through  S  so  that  the  iterates  do  not  appear  to  converge  and  yet  ^ 

decreasing,  albeit  very  slowly.  The  presence  of  S  also  implies  that  the 
2 

Hessian  V  F (G  )  will  have  very  small  eigenvalues  corresponding  to  directions 
n 

in  S  (particularly  in  Hessians  from  model  problem  I)  so  that  Eq.  (5.14)  is 
difficult  to  solve  numerically. 

These  problems  can  be  overcome  by  calculation  of  the  eigendecomposition 
2 

of  V  F(G^),  filtering  the  eigenvalues  (which  are  real  since  the  Hessian  is 
symmetric)  by  choice  of  a  cutoff  parameter  ^^®  filter 

f  (0)  =  O  ,  if  [ol  >  £3 


(5.27) 


=  0 


otherwise  , 


then  finding  the  minimum  norm  least  squares  solution  of  the  resulting 

filtered  version  of  Eq.  (5.14).  This  scheme  produces  a  search  direction  in 

which  F(G)  should  vary  moderately  rapidly  because  directions  corresponding 

to  small  eigenvalues,  and  thus  slowly  varying  F(G),  have  been  filtered  out. 

We  denote  such  a  filtered  Newtonian  algorithm  by  FN.  The  resulting  direction 

2 

z  is  not  necessarily  a  descent  direction  (V  F(G  )  may  have  large  negative 
n  n 

eigenvalues)  so  we  shall  also  consider  a  variant  of  FN  using  the  filter 


f(a)  =  a 
=  0 


if 

otherwise  . 


The  resulting  algorithm  is  termed  FNP. 


(5.28) 
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The  problem  of  global  ill  conditioning,  in  particular  the  presence  of 
multiple  minima  due  to  possible  nonuniqueness  of  the  phase  retrieval  problem, 
cannot  be  addressed  by  similar  modifications  to  these  undertaken  for  local  ill 
conditioning  simply  because  the  algorithms  are  derived  from  local  analysis. 

5.5  Block  FlltcAylng 

Having  shown  the  need  for  filtering,  we  now  show  that  Eq.  (5.14)  can  be 
solved  by  a  filtered  solution  of  Eq.  (5.26)  in  which  each  block  is  prefiltered 
and  a  standard  least  squares  algorithm  then  applied  (avoiding  the  cost  of  a 
singular  value  decomposition  of  the  block  matrix) .  We  begin  by  noting  that  if 
our  algorithm  is  to  be  efficient  then  we  must  solve 

2 

P  V  F(G  )H  =  -P  VF(G  )  (5.29) 

c  n  c  n 

rather  than  Eq.  (5.14),  so  that  values  only  over  the  interval  cl  are  used. 
(For  this  reason,  Eq.  (5.29)  rather  than  Eq.  (5.14)  is  also  used  in  FN  and 
FNP) .  We  also  note  that  the  Hessians  are  not  complex  valued  linear  operators 
since  it  will  be  obvious  that  for  an  arbitrary  complex  number  a 

(G  )  (aH)  ?<a(3j!f(G  )H)  .  (5.30) 

in  in 

Rather  they  are  linear  operators  on  the  pair  of  real  functions  (Re  H) (w)  and 
dm  H)  (w)  . 

The  ability  to  do  block  prefiltering  will  depend  upon  the  special  form 

of  the  Hessian  in  Eq.  (5.26).  For  the  operators  and  associated  with 

P_  and  P,  it  is  obvious  that  PJIf  In  addition,  simple  calculations 

2  3  c  1  1 


establish 


I 


(Re  H)  {(j) 


(Re  H)  (uj) 


2  n  I 


I  {Im  H) (^) 


if  (Re  G  )  (cj)  >  0 
n 

u)  £  cl 


otherwise 


(5.31) 


3  n 


(Re  H)  (cj) 


dm  H)  (oj) 


G  (dj) 
'  n 


2 

sin  6  -sin  6  cos  6 


-sin  6  cos  6  cos  6 


(Re  H)  (u)) 


(Im  H)  (to) 


where 


6  =  arg  G  (to)  ,  to  €  cl 
n 


(5.32) 


(5.33) 


The  operators  I  can  be  represented  by  2^2  bloclc  diagonal  operators  ; 

at  each  point  u  the  blocks  are 


(1 -Jif_(G„) )  (to)  = 
2  n 


,  if  (Re  G  )  (to)  >  0 
n 


otherwise 


(5.34) 


[I  -JJfdG  )  ]  (to)  = 
3  n 


_  n((0) 

sin  6  cos  9  Tg  ^)  sin  6  -cos  6 

n 


•cos  6  sin  6 


cos  0  sin  6 


(5.35) 


The  spectrum  of  each  operator  I  is  now  seen  to  be  simply  the  union 

over  to  of  the  spectrum  of  each  block.  For  1  this  gives  a  spectnim 
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consisting  only  of  the  points  {o,l},  so  is  its  own  square  root. 

1/2 

Therefore  the  only  prefiltering  that  needs  to  be  done  to  (I  is  to 

eliminate  all  equations  for  (Re  H)  (oj)  that  correspond  to  zeros  in  the  block 

spectra.  For  the  spectrum  will  take  on  a  range  of  values.  Some  of 

these  values  may  be  small  or  negative  if  for  some  U)  the  quantity 

1  /2 

(1  -  n(co)/|G^(w)  |)  is  small  or  negative.  Consequently  (I  either  is 

not  well  defined  (i.e.  has  imaginary  eigenvalues)  or  is  ill  conditioned,  or  is 

1/2 

both.  To  prefilter  the  block  (I  we  therefore  choose  a  parameter 

e j  >  0  and  ona  of  the  following  filters 

« 

f  (o)  =  sgn  0,  if  lo|  >0^ 

=  0  ,  otherwise  (5.36) 


f  (0) 


.1/2 


if  0^03 


=  0 


otherwise 


(5.37) 


The  prefiltered  block  is  then  defined  as 


(I 

o  n  t 


sin  9  cos  0 
-cos  6  sin  6 


0 

1 


sin  6  -cos  0 

cos  0  sin  0 

(5.38) 


It  is  reduced  in  size  by  eliminating  those  equations  for  sin  0  (Re  H)  (ca)  - 
cos  0(lm  H)  (cj)  that  are  associated  with  elements  in  the  spectrum  that  are 
filtered  out. 

From  Eq.  (4.39)  has  the  form 
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cte.  (G  ) 

1  n 


)P^ 

c  c  c  A  n  c  c 


(5.39) 


so  that 


(1-P^){G^)  = 


P  (g  ))P.^ 

c  c  A  n  c  c 


(5.40) 


where  (I-J^(g^))  is  in  block  diagonal  form  over  cl  i 


[I -J(^(gn)]  (v)  = 


sin  ((>  cos  4> 
-cos  4)  sin  <p 


1  - 


m(v) 


sin  $  -cos  <|) 
cos  4>  sin  4> 


(5.41) 


with 


4)  =  arg  g^  (v)  , 


V  6  cl 


(5.42) 


Unfortuna^ly  I  -  does  not  possess  an  obvious  symmetric  square  root. 

However  an  asymmetric  root  can  be  found  by  noting  that 


(I  -  P^  (G  )  )H  =  G  -  P  P,G 
c  T.  n  n  c  1  n 


(5.43) 


is  the  normal  equation  associated  with  the  least  square  problem 


min  ll(I-jK(g  ) )  H-P  P,P  ^  G  11^ 

A  n  cc  cAccn 

H 


(5.44) 


1/2 

so  that  we  may  take  for  the  block  (I  -  the  composite  operator 

1/2 

(I prefilter  each  component.  The  filtered  component 

1/2  1/2 
(I-J*^)^  is  calculated  in  the  same  fashion  as  (I -J^)  ^  ;  the  component 

P requires  further  manipulations  before  prefiltering.  Let  P  jJt*  have 
CO  c  c 

an  eigendecomposition  UlU^ ,  then  as  U  is  a  unitary  operator  the  solution 


to  a  prefiltered  Eq.  (5.26)  is  given  by  H^  =  UL^,  where  is  the  least 
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squares  solution  to 


(I  -Ji^(g  ))yV 

P  P  G 

A  n  f 

L  = 

c  A  c  c  n 

(I  (G  ))y^V 

inf 

P.G  -  G 
in  n 

where  L  may  be  filtered  using  Eqs.  (5.36)  or  (5.37). 

The  reason  for  choosing  to  solve  for  L  instead  of  H  lies  in  the 

n  n 

choice  of  the  pointwise  discretization  used  in  numerical  calculations  and  in 
the  form  of  the  IMSL  subroutine  LLSQF  used  to  solve  the  discretized  Eq.  (5.45). 
This  subroutine  uses  an  adaptive  QR  Algorithm,  at  each  stage  a  column  is 
chosen  from  the  block  matrix  euid  incorporated  in  the  QR  factorization 

calculated  at  the  previous  stage.  A  running  estimate  is  kept  on  the  condition 

% 

number  of  R,  if  this  estimate  exceeds  a  use  defined  parameter,  ATOL,  then  the 
routine  halts  and  computes  a  minimum  norm  least  squares  solution  using  the 
most  recent  QR  factorization  and  also  sets  components  in  L  corresponding  to 
unused  columns  to  zero.  For  a  solution  H  this  corresponds  to  setting  point 
values  of  H(a))  to  zero;  for  L  it  is  setting  components  of  L  in  an  eigen- 
fianction  expansion  to  zero.  Our  analysis  of  ill  conditioning  indicates  that 
such  coit^)onents  are  likely  to  be  zero  amyway,  so  we  believe  Eq.  (5.45)  to  be 
the  preferred  form  for  solution. 

We  shall  term  the  algorithm  based  on  Eq.  (5.45)  with  the  filter  given  by 
Eq.  (5.24)  the  SQFN  algorithm,  and  that  based  on  Eq.  (5.45)  with  the  filter 
given  by  Eq.  (5.37)  the  SQFNP  algorithm.  SQFN  incorporates  more  information 
c±>out  the  Hessian  at  each  step,  but  SQFNP  gives  a  guaranteed  descent  direction 
H  at  each  step. 


152 


6. 


NUMERICAL  RESULTS 


We  now  present  results  for  the  solution  of  the  discretized  model 

problems  using  the  various  algorithms  so  far  proposed  and  show  the  discrete 

approximate  solutions  G  generated.  The  main  measure  of  performance  used  in 

this  section  will  be  the  number  of  iterations  n  required  by  the  algorithm 

to  reduce  the  function  below  a  prescribed  value.  Since  we  are 

interested  more  in  the  comparative  than  absolute  behaviour  of  the  algorithms 

we  tcGce  as  the  benchmark  case  the  RP  algorithm  and  consider  the  quantity 

n/n  where  n  is  the  number  of  iterations  required  by  RP  to  reduce  F(G-) 

n 

to  below  a  prescribed  value. 

The  ill  conditioned  nature  of  the  problem  means  that  this  is  not  the 
best  of  measures.  Difficulties  are  encountered  through  behaviour  such  as  the 
erratic  decrease  of  often  a  succession  of  iterates  produce  little 

change  in  end  then  a  sudden  decrease  is  recorded.  In  some  algorithms 

the  iterates  may  converge  to  a  particular  value  6  whereas  in  others  they 

/V  X 

fail  to  converge  so  that  end  P(G)  can  be  considerably  different. 

For  these  reasons  the  results  reported  here  are  intended  as  a  guide  to  the 
behaviour  of  algorithms  on  real  problems  rather  than  a  prediction.  However, 
poor  though  the  measure  is,  there  is  none  other  available  for  use  that  does 
not  require  prior  knowledge  of  the  solution. 

6.1  V^cAeZization,  Tut  Functiom  and  AZgofUtlim 

The  choice  of  discretization  in  these  calculations  was  determined  by 
the  condition  that  it  give  am  accurate,  efficient  approximation  to  the 


finite  Fourier  transform  and  that  the  discretized  projections  be  easily 
calculated.  Therefore  a  discretization  based  on  N  point  Gaussian 
quadrature  was  chosen;  the  previous  paper  shows  its  accuracy  and  efficiency 
and  its  pointwise  nature  allows  easy  evaluation  of  projections.  The  finite 
dimensional  problem  now  involves  vectors  g,  GtC  whose  elements  g  ,  G 

iv  iC 

are  to  be  approximations  to  the  function  values  g(P.  )  >  G(p  )  at  the 

JC  iv 

abscissae  of  the  N  point  Gaussian  quadrature  rule  on  cl.  Vectors 

^  ^  N 

m,  n  £]R  are  formed  with  components  mj^  =  m(Pj^),  nj^  =  n(pj^)  where  m(v)  and 
n(w)  are  the  known  moduli.  Matrices  W,  FtC  are  constructed  with  W 
being  a  diagonal  matrix  whose  k-th  entry  is  the  weight  of  the  quadrature 


rule  and  with  F  having  entries  Fj^^  =  £ 
the  discretized  model  problems 


.  This  construction  gives  as 


Find  vectors  g  and  G  such  that  g*FWG  and 

I:  I9J  = 

II:  \\\  = 

III:  |g,^l  = 

where  th  and  n  are  specified. 


Since  the  metric  for  all  cuialysis  so  far  has  been 


the  numerical 


calculations  were  done  on  a  rescaled  version  of  the  equation  g  =  FG, 


(6.1) 


'^1/2'-  ^1/2'' 

With  this  rescaling  the  Euclidean  norms  of  the  vectors  W  G  and  W  g 
are  now  good  approximations  to  the  norms  HgII^  and 

is  a  good  approximation  to  ^ ^  operator  norm  induced  by  II  *112 • 
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The  two  test  functions  chosen  for  the  model  problems  were 


G^(o.)  = 


—  TTi  P 


G  U)  =  e 


(6.2) 


where  P^(^i)  is  the  i-th  monic  Legendre  polynomial.  The  Fourier  transform 
g^{v)  of  G^(oj)  is 


1  .  ^ 

(2Trv-i) 


(6.3) 


g  (v)  has  no  closed  form  and  was  calculated  numerically.  Both  functions  were 

used^  as  test  cases  for  model  problem  I ,  G^  and  as  a  test  case  for 

2  2 

problem  II  and  G  and  g  as  a  test  case  for  problem  III.  The  discrete 

i  i  <^1 

approximations  to  G  (w)  and  g  (v)  are  denoted  by  G  and  g  . 

All  numerical  tests  used  the  following  basic  algorithm 

1:  Choose  an  initial  guess  Gq 

2;  Given  iterates  g  and  G  compute  a  search  direction  H 

^n  n  ^  n 

3:  Calculate  a  step  length  and  new  iterates 


G  ^  =  G  +X  H  , 

n+1  n  n  n 


g  .,  =  FWG 


n+1  n+1 

4:  Iterate  steps  2  and  3  until  the  convergence  criteria  are  satisfied. 
The  convergence  criteria  used  in  all  calculations  were 


F(G)<10"  or  y  IIX  ,H  ,11  <  2  X  lo” 
n  ,*-<  n-k  n-k 

k=0 


(6.4) 


together  with  an  upper  limit  on  the  number  of  iterations.  The  sum  of 
the  last  three  steplengths,  rather  thcin  alone,  was  chosen  as  in  ill 
conditioned  minimization  problems  "stop-start"  behaviour  is  often  noticed. 
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That  is  a  large  step  often  followed  by  one  or  two  small  steps  after  which 


another  large  step  is  taken.  This  feature  was  often  observed  in  the  use  of 
any  second  order  algorithm,  differences  in  magnitude  of  successive  step 
lengths  by  factors  greater  than  100  occurred  fairly  frequently. 

Three  different  forms  of  initial  guess  were  used 


1. 


2. 

3. 


^  0 

=(l-e^) (l+e^r  )gJ 


Guess  2  is  a  damped  perturbation  of  the  true  solution  G  with  repre¬ 

senting  the  noise  level.  For  convenience  we  shall  express  this  level  as  a 
percentage,  e.g.  £^=.2  will  be  expressed  as  £^  =  20%  noise.  The  variable 
rj^  is  a  random  complex  variable  with  modulus  uniformly  distributed  over 
[0,1]  and  phase  uniformly  distributed  over  [0,2itJ.  Guess  3  represents  a 
small  totally  random  perturbation  about  the  origin,  again  for  convenience  such 
guesses  shall  be  denoted  by  £^=100%. 


6.2  Ckoicz  0|5  Stzp  Lzngttu 

The  first  results  reported  are  those  on  determination  of  an  optimal 
choice  of  Three  possibilities  were  considered 


2 

2.  X  =  r  where  r  is  a  random  variable  uniformly  distributed 
n  n  n 


3.  X  is  the  approximate  minimum  of  F(G  +A  H  )  as  a  function  of  'X 
n  n  n  n 

determined  by  Powell's  cubic  line  search  algorithm  [  16  ]  with  the 
following  convergence  criteria  on  the  iterates  j^X^ 


F(G  +  ,  X^H  ) 

k  n  k-1  n 

n  k  n  n 

,X^ 

F(G  ) 

k  n 

n 

<  .02 


Vf(g  +,X^h  ) 

F(G  +  ,  X^H  ) 

n  k  n  n 

n  k  n  n 

VF(G  ) 

F{G  ) 

n 

n 

<  .02 


(6.5) 


k  <5 


The  performances  of  X^  were  compared  by  running  each  possibility  on  each 
model  problem  with  the  appropriate  test  functions  using  the  RP  algorithm. 

The  parameters  c  =  2,  N  =  40  and  were  chosen  and  each  problem  was 

started  with  three  different  initial  guesses  with  £^  =  20%,  60%  and  100%. 

The  results  were  remarkably  uniform  over  all  test  cases  of  model  problem, 

1  2 

test  function  and  initial  guess.  Choices  X^  and  X^  performed  almost 
identically  with  X^  just  under  a  factor  of  2  better.  Almost  always  only  one 
extra  fimction  evaluation  was  required  for  X^,  i.e.  the  convergence  criteria 
of  Equation  (6.5)  were  satisfied  at  k  =  1.  This  extra  computation  almost 
exactly  balances  the  savings  in  the  reduced  number  of  iterations  so  that  all 
three  choices  incurred  the  same  computational  cost  in  reduction  of  F(G^)  to 
a  specified  value.  However  the  greater  flexibility  of  X^  led  to  its 
adoption  in  all  subsequent  calculations. 

The  convexity  of  the  set  cl  implies  that  F(G  +Xh  )  <F(G  )  for  all 

n  n  n 


X6  (0,2),  the  computations  showed  that  this  inequality  still  frequently  held 


for  A€  (2,3)  but  almost  always  failed  beyond  this.  Almost  all  the  values  of 


lay  in  the  interval  [1,2.5]  so  in  all  sxabsequent  line  searches  an  initial 
-  3 

guess  of  =  4  was  used.  As  an  experiment  further  runs  were  made  with  the 


fixed  choice  ^  these  compared  very  poorly  with  A^. 


6.3  InltLaZ  Gue^-iei  and  ILL  Conditioning 

We  now  present  some  results  on  the  ill  conditioning  of  the  model 
problems.  The  first  set  are  from  efforts  to  estimate  local  ill  conditioning 
through  the  effects  of  small  perturbations  of  the  data  m(v).  They  were 
obtained  from  model  problems  I-III  with  the  appropriate  test  functions,  para¬ 
meters  c  =  2 ,  N  =  40  and  N  =40  and  using  the  RP  algorithm.  Random 
relative  errors  of  size  £2  were  induced  in  the  vector  m  representing  the 
data,  then  the  algorithm  was  started  at  the  true  solution  of  the  un¬ 

perturbed  problem.  Table  1  gives  some  typical  results  for  two  perturbations 
£2  =  1%  and  £2  =  5%.  The  data  appearing  is  the  percentage  relative  error  in 
G  (100.  IIg  -  G^II/IIg^II)  and  the  value  P{G)  where  G  is  the  approximate 
solution  to  the  perturbed  problem  reached  by  the  algorithm. 

The  data  indicate  that  the  problems  are  locally  well  conditioned  in  that 

there  exists  a  possible  nearby  solution  to  the  perturbed  problem  to  which  the 

iterates  tended  and  whose  distance  from  the  solution  of  the  unperturbed 

problem  is  of  the  same  order  as  the  perturbation.  We  cannot  assert  that  the 

solution  does  exist  for  in  almost  all  cases  the  algorithm  terminated  with 

F{G  )  <  10  ^  or  n>N  .  Termination  due  to  convergence  of  successive 
n  max 
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iterates  did  not  occur  although  they  always  appeared  to  be  circling  about  some 
common  point.  This  problem  was  encountered  throughout  these  test  computations, 
however  we  delay  further  discussion  until  the  presentation  of  results  on 
fi Itering. 

The  second  set  of  results  indicates  the  global  ill  conditioning  of  the 
problem,  particularly  with  respect  to  the  natural  measure  F(G).  Figures  4-6 

represent  three  functions  G  found  as  solutions  to  model  problem  I  with  test 

XT  X  ,  -5  -5  -4 

function  G  with  values  F(G)  of  8  10  ,  7  x  lo  and  8  x  lo  , 

X 

respectively.  Figures  7-9  show  the  G's  obtained  by  repeating  the  runs  for 
model  problem  III  holding  all  other  factors  (algorithm,  initial  guess,  etc.) 

X  -2  -2  -2 

constant.  The  values  F(G)  were  2X10  ,  3  10  and  7  x  lo  ,  respectively. 

Although  all  functions  G  resemble  G  to  some  degree  the  wide  variance 
among  them  suggests  that  the  surface  T  =  { (F (G) ,G) :G  €  cl}  is  very  rugged. 

Furthermore  although  the  functions  in  Figures  7-9  appear  graphically  to  be 

a2  '' 

closer  to  G  than  those  in  Figures  4-6  the  function  F(G)  holds  them  to  be 

substeintially  further  away. 

In  general  the  solutions  G  to  model  problems  II  and  III  were,  as 
suggested  by  these  graphs,  definitely  more  acceptable  as  solutions  to  the 

phase  retrieval  problem  than  solutions  to  model  problem  I;  nevertheless  the 

^  3 

values  F(G)  for  II  and  III  were  almost  always  a  factor  of  10  to  10  greater 

than  those  of  I.  A  tentative  ranking  of  the  problems  in  order  of  decreasing 

''2 

ill  conditioning  would  be  I  (with  G  )  > II  >  I  (with  G  )  > III.  As  expected 

more  information  gives  better  solutions,  however  we  were  not  sure  to  what 

^1 

degree  the  particular  choice  of  G  with  its  large  discontinuity  at  ciJ  =  c 


influenced  this  ordering. 

We  conclude  this  subsection  with  a  brief  mention  of  an  anomaly  that  lead 
to  use  of  initial  guess  3  rather  than  guess  1.  As  noted  in  a  previous  paper 
[19]  it  is  possible  to  show  that  if  iterates  possess  certain  symmetries  then 
almost  all  the  algorithms  preserve  these  symmetries.  In  particular  with  the 
highly  symmetric  choice  of  0^  =  0  some  form  of  symmetry  was  always  preserved 
yet  the  algorithm  often  converged  to  reasonable  pseudo-solutions .  One  such 
example  for  model  problem  I  with  N  =  40  and  c=2  is  shown  in  Figure  10, 

X  X  _5 

it  displays  a  symmetric  solution  G  for  which  F(G)  <5x10  .  Therefore,  if 

/\ 

no  a  priori  knowledge  is  available  for  G,  a  small  random  perturbation  is  a 
better  initial  guess  than  zero. 


6.4  ?iJU.(Ln.e.d  PR  and  PA  klQoaiXhm 

Having  determined  the  problem  to  be  globally  ill  conditioned  we  sought  to 
produce  a  better  conditioned  global  problem  by  filtering  out  the  local  ill 
conditioning  of  the  finite  Fourier  transform  discussed  previously.  This  was 
done  by  projecting  the  search  directions  (and  therefore  the  iterates  G^) 

on  to  the  span  of  the  first  P  eigenfunctions  of  ^ ^  whose  associated 
eigenvalues  had  magnitude  greater  than  a  prescribed  cutoff  e^.  As  a  test 
this  filter  was  applied  to  iterates  in  the  RP  algorithm  producing  the  filtered 
restricted  projection  algorithm  (FRP)  and  the  algorithms  compared  on  all  model 
problems  with  parameters  c  =  2,  N  =  40  and  choices  of  cutoff 

£^=•9  and  £2=. 25  were  considered, these  values  gave  projections  on  to 
subspaces  of  dimensions  15  and  18,  respectively. 
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As  measured  by  F (G)  FRP  performed  very  poorly  compared  to  RP  for 

/s. 

e, =  .9.  At  best  1.5  times  as  many  iterations  were  needed  to  reduce  F (G  )  to 
3  n 

comparable  values;  very  often  FRP  failed  in  that  it  ran  the  full  N 
^  max 

^  95  X  X 

iterations  reaching  a  point  G  such  that  F(G)  >100F(G),  where  G  was  the 
final  iterate  of  RP  under  the  same  conditions.  FRP  substantially  improved 
with  £2=  .25,  now  averaging  1.5  times  as  many  iterations,  but  still  occasionally 
failing  in  the  manner  described  above.  However,  if  meaisured  by  the  distance 

^  X 

between  final  iterates  the  algorithms  appeared  to  be  more  equal,  IIg  -  gH  was 

usually  less  than  .25  for  all  algorithms. 

The  disappointing  result  was  that  FRP  did  not  display  improved  conver- 

2 

gence  properties  over  RP  with  regard  to  the  size  of  I,  UX  ,1.  Iterates 

k»0  n-k  n-k 

still  appeared  to  wander  through  the  subspace  without  much  effect  on  F(G^). 

It  seems  that  a  fifteen  dimensional  subspace  is  still  too  large  and  quite 

severe  restrictions  (P~5)  must  be  used  to  ensure  convergence  to  a  minimum. 

These  results  were  borne  out  by  the  results  and  5>r!i  arison  of  the  PA  and 

RP  algorithms  over  all  model  problems  for  parameter  lairs  ,Ni  of  (2,40)  , 

(2,32)  and  (1.5,22)  euid  N  *40.  A  variety  of  filters  were  used  but  even 

mdx 

for  the  best,  using  the  cutoff  of  Equation  (6.6),  the  results  nad  a  large 

variance  and  averaged  out  so  that  PA  still  required  about  the  same  number  of 

iterations  to  achieve  the  same  improvement  in  F(G^).  Again  termination  was 

due  to  reduction  of  F(G)  or  n^N  and  not  to  convergence  of  successive 

max 

iterates.  One  reason  for  this  is  that  for  the  parameter  values  c  under 

consideration  almost  all  significantly  nonzero  eigenvalues  of  P  JTP  have 

c  c 

magnitude  ~1  so  that  the  filtered  inverse  ^^c^^c^f^ 


and  the  Hermitian 


6.5  Relcutcva  PeA^oAmancc  Second  OfideA  Alqo^'Uthms 

We  now  present  results  on  the  relative  performance  of  algorithms  FN, 

FNP,  SQFN  and  SQFNP  compared  to  RP.  They  are  not  comprehensive;  considera¬ 
tions  such  as  the  number  of  parameters  at  the  users  disposal,  failure  of  the 
IMSL  subroutines  on  certain  equations  and  computational  cost  prevented  this. 
The  algorithms  were  run  on  each  model  problem  with  the  parameter  pairs  (c,N) 

being  (2,40),  (2,32)  and  (1.5,22),  N  ranging  from  15  (for  computa-  '' 

IHdX 

tionally  costly  algorithms)  to  50  and  with  perturbations  in. the  initial  guess 
of  e^  =  20,  60  and  100.  The  cutoff  parameters  in  Equations  (5.26), 

(5.27),  (5.34)  and  (5.35)  used  at  the  n-th  iteration,  were  calculated  from 
quantity 

p  =  max{min{4 . 2llA  ,H  ,11  -  .024,  .2},  .002}  .  (6.6) 

n- 1  n- 1 

For  FN  amd  FNP  €^  =  0,  for  SQFN  and  SQFNP  e^  =  p/3.  The  IMSL  subroutine 

_2  ^ 

EIGRS  was  used  to  calculate  the  eigendecomposition  of  V  F(G^),  the  subroutine 
LLSQF  to  calculate  the  minimum  norm  least  squares  solution  of  Equation  (5.42), 
with  the  upper  boxind  ATOL  on  the  condition  number  of  R  in  the  QR  factoriza¬ 
tion  specified  to  be  e^/lO. 
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The  results  in  Tables  2-5  aive  first  the  averaae  oerformance  of  RP  on 
each  problem  in  terms  of  the  final  value  F(G)  and  the  number  of  iterates  n 
needed  to  reach  this.  The  remaining  entries  are  the  ratios  n/n  where  h  is 
the  average  number  of  iterations  needed  by  the  alternative  algorithm  to  reduce 
F(G-)  to  below  F (G) .  In  some  cases  the  algorithm  on  trial  failed  to  reduce 

/s  > 

F(G-)  to  below  lOOF(G),  such  cases  are  marked  by  a  .  In  other  cases  the 
n 

Ak  as, 

alternative  algorithm  converged  but  with  larger  than  ^  in  which 

case  the  n  of  the  ratio  was  altered  to  the  number  of  iterations  required  by 

RP  to  reduce  F{G~)  to  below  F(G-). 

n  n 

The  remaining  tables  show  the  effect  of  (euid  thus  the  need  for)  filtering. 
Table  6  contains  pairs  giving  the  range  of  the  number  of  eigenvalues  preserved 
at  each  iteration  by  the  filter  in  FN  and  FNP.  The  final  table  for  5QFH 
and  SQFNP  shows  two  pairs,  the  upper  giving  the  range  of  the  number  of  rows 
in  the  filtered  block  equation  (5.40)  ,  the  lower  pair  indicates  the  range  of 
the  number  of  columns  used  by  LLSQF.  Typically  the  larger  figures  in  the 

^  As 

range  occur  as  decreases  and  G^  converges,  however  the  "stop-start" 

behaviour  of  the  iterates  often  caused  these  measures  of  filtering  to  also 
jump  about. 

Finally  we  present  graphs  (Figures  11-22)  of  final  iterates  for  RP  with 
parameters  c  =  2  and  N  =  40  for  each  model  problem  and  e^  =  20%,  60%  and 
100%.  As  noted  previously  final  iterates  varied  much  less  than  final  values 
of  ^he  graphs  are  typical  of  all  final  iterates  for  these  problems. 

Some  points  worth  noting  on  observed  algorithm  behaviour  are  given.  RP 

As 

hardly  ever  halted  due  to  convergence  of  the  sequence  G^,  iterates  exhibited 


steady  "zig-zagging"  with  accompanying  small  but  steady  decreases  in 

F(G  )  after  an  initial  burst  of  reduction  in  the  first  10  or  so  iterates, 
n 

The  higher  order  algorithms  did  give  convergent  iterates  on  a  nvimber  of 

/S  /s. 

occasions  but  some  limit  points  were  not  true  minima  since  at  G  VF (G  )  0 

n  n 

but  VF(G^)  lay  in  the  null  space  of  the  filtered  Hessian. 

The  cutoff  parameter  given  by  Equation  (6.6)  was  derived  after  some 
trials  and  errors  and  is  certainly  problem  specific.  However  it  performed 
satisfactorily  for  FN,  FNP,  SQFN  and  SQFNP  in  that  it  generated  that 

almost  always  lay  in  the  interval  [.1,1.5],  and  for  FN  and  FNP  started  close 
to  the  true  solution  the  expected  value  of  was  always  observed. 

Moreover  it  was  noted  that  RP  gave  the  best  initial  decreases  in  when 

Started  with  =  100%,  so  in  all  algorithms  the  final  search  direction  H^ 

*  '^l 

was  taken  to  be  a  weighted  linear  combination  of  the  H^  determined  by  the 

^2 

particular  algorithm  euid  the  of  RP,  i.e. 


(6.7) 
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7. 


CONCLUSIONS 


The  ill  posed  nature  of  phase  retrieval  induces  enough  variance  in  the 
data  to  prevent  the  drawing  of  quantitative  conclusions,  however  qualitative 
results  are  apparent.  As  expected  FN  and  FNP  show  the  best  performance  when 
iterates  are  close  to  the  true  solution  ,  with  FNP  displaying  greater 
robustness  in  regions  further  away.  Close  to  the  solution  SQFN  and  SQFNP 
do  not  improve  upon  RP ,  but  they  give  the  best  results  when  started  from  a 
random  initial  guess.  Since  SQFN,  SQFNP,  FN  and  FNP  are  all  filtered  variants 
of  the  same  basic  algorithm  it  should  be  possible  by  a  judicious  choice  of 
filters  to  combine  the  best  features  of  all  in  one  algorithm;  to  do  so 
requires  further  study  of  the  block  filterings  of  subsection  5.5. 

As  measured  in  terms  of  the  computational  cost  to  reduce  ® 

certain  value  RP  appears  to  be  the  certain  winner  and  it  is  difficult  to  see 
how  any  other  algorithm  can  be  improved  enough  to  compete  with  it.  Of  the 
second  order  algorithms  SQFNP  seems  to  be  the  most  efficient,  requiring  a 
matrix  with  .8  the  number  of  rows  as  SQFN  to  give  approximately  the  same 
results.  Furthermore  the  cost  can  probably  be  substantially  reduced  by 
taking  advantage  of  the  sparsity  and  special  structure  of  the  blocks  in 
Equation  (5.42). 

Tables  6  and  7  confirm  that  the  essential  dimension  of  model  problem  I  is 
twice  the  number  P  of  significantly  nonzero  eigenvalues  of 
essential  dimension  of  II  and  III  is  harder  to  estimate  although  a  reasonable 
upper  boxind  would  be  2P  +  n.  Certainly  for  N  =  22  auid  32  the  problems 
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appear  to  be  almost  well  posed.  The  tables  also  indicate  that  II  may  be  more 
(locally)  ill  conditioned  than  III  although  this  may  be  attributable  to  the 
form  of  the  test  functions. 

The  oscillatory  nature  of  the  graphed  solutions  suggests  that  further 
filtering  is  necessary  and  that  local  filtering  to  the  degree  done  here  is 
insufficient  to  give  a  well  posed  problem.  Three  options  for  further 
conditioning  are: 

1.  Use  of  a  more  restrictive  local  filter 

2.  Regularization  of  F(G)  by  addition  of  a  penalty  function  e.g. , 

F^(G)  -  r(G)  .afgl'  (7.1) 

3.  Use  of  the  algorithm  above  followed  by  smoothing  of  the  numerical 
solutions  G*. 

Each  option  has  its  attractions:  with  use  of  1  the  iterates  would  be 
restricted  to  the  span  of  the  first  5  to  10  singular  functions,  these  are 
known  to  be  smooth  eind  slowly  varying  so  a  linear  combination  would  also  be 
reasonably  well  behaved.  Note  that  this  also  appears  if  the  interval  size 
c  is  reduced  to  around  .75  or  smaller,  thus  having  less  information  may 
actually  give  a  better  solution.  For  2  a  suitably  chosen  penalty  function 
can  easily  be  accommodated  within  the  theory  of  algorithms  based  on  projections 
developed  here.  Finally  visual  inspection  of  the  graphs  shewn  indicates  that 
the  functions  G  are  often  very  perturbed  but  are  basically  similar  to  the 

true  solution  G  so  that  option  3  will  often  give  2ui  acceptable  final 


solution. 


APPENDIX  A.  ON  THE  EXISTENCE  OF  MULTIPLE  SOLUTIONS  TO  THE  2-D  PHASE 
RECOVERY  PROBLEM 

The  two-dimensional  phase  retrieval  problem  can  be  stated  as: 

2 

Let  A  arid  B  be  bounded  subsets  of  ]R  .  Given  the 
information  that  giz^,z^)  is  the  Fourier  transform  of  a  function 
G{(A)^,a)2)  with  support  contained  in  B  and  the  values  mCx^jX^)  = 
lg{x^,X2)l  on  A,  find  the  phase  of  g  on  A  euid  reconstruct  G. 

The  phase  retrieval  problem  does  not  necessarily  have  a  unique  solution. 
The  aim  of  this  appendix  is  to  derive  from  a  given  solution  pair  g  and  G 
ne.ce^-i(VLy  conditions  for  the  existence  (and  characterization)  of  alternative 
solution  pairs  h  and  H.  An  intuitive  start  for  such  an  investigation  is 
the  simple  result  that  if  f(z)  is  an  analytic  function  of  the  complex 
variable  z  then  so  is  f*(z*)  and  f(z)  and  f*(z*)  have  the  same  modulus 
for  real  z.  This  suggests  that  if  a  solution  g  can  be  factored  into  a 
product  of  analytic  functions  g^  and  g^  then  the  entire  function  of  two 
complex  variables  h(z^,Z2)  ^  is  also  a  possible  solution. 

The  main  result  of  this  appendix  is  to  show  that  all  possible  alternative 
solutions  must  be  of  this  simple  form. 

A.l  Om-V>um¥iitionaZ  RuuLU 

The  following  results  from  the  theory  of  functions  of  a  single  complex 
variedsle  are  required. 

The.OA.e.m  A.l:  PaJie.y-W^e.neA.  ThzoACrr  [20].  Let  B=  be  a  bounded  inter- 

2 

val  in  ]R.  Then  for  any  G€l  OR)  »  G^O  on  B,  the  transform 
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(A.l) 


The  next  result  is  the  fundamental  theorem  providing  the  necessary 
machinery  to  characterize  all  possible  solutions  to  the  phase  problem  both  in 
one  and  two  dimensions.  Although  independently  derived  by  many  authors  [3,21], 
it  appears  to  have  been  first  stated  by  Akutowicz  I8.9j.  We  state  the 
result  as  originally  presented  there. 

2 

TheoAem  A.  2.  Let  ^  be  the  class  of  all  functions  g€L  (R)  satisfying 

1.  I g (x)  I  =  m (x)  ^  0  Vx£JR 

2.  g  =  where  support  of  G  is  contained  in  a  bounded  interval 

B  of  R. 

Then  any  two  functions  g,  h€^  are  related  by  equations  of  the  form 


h(z) 

B(z) 


i  (a+Bz) 
e 


B(z)g(z) 


(A. 3) 


where  the  Zj^  form  some  subset  of  the  zeros  of  g(z)  .  The  function 
(z-zj)  / (z-Zj^)  is  termed  a  Blaschke  factor. 
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Lznvna  A. I :  A  necessary  and  sufficient  condition  for  the  infinite  product 
B(z)  to  converge  is  that  [9] 


£=1  l+lzj-l 


A  sufficient  condition  for  the  convergence  of  the  infinite  sum  is  that  G(u) 
have  only  a  finite  number  of  jump  discontinuities  over  B. 

It  is  easily  shown  that  if  G  has  support  in  an  interval  B  and  if 
h(z)  =B(z)g{z),  then  H(u)  =  ^h)  (u)  also  has  support  in  B.  Therefore 
combining  theorems  A.l,  A. 2  and  lemma  A.l  gives  the  following  statement  on 
existence  of  multiple  solutions  to  the  1-D  phase  retrieval  problem. 


Tlie.OfLe.in  A.3<*  Let  a  and  B  be  bounded  intervals  in  ]R  with  a  modulus 
m(x)  specified  over  A  and  a  solution  pair  g  and  G  be  given  to  the 
corresponding  1-D  phase  problem.  Then  if  m(x)  has  an  extension  to  the 
entire  real  line  such  that  m(x)  >0  and  G(u)  has  only  finite  jump  dis¬ 
continuities  over  B  as  well  as  being  nonzero  in  neighborhoods  of  the  end¬ 
points  of  B,  then  all  other  solution  pairs  h,H  are  given  by 

h(z)  =  e^°^B(z)g(z)  (A. 5) 

H(u)  =  (^“^h)  (u)  (A. 6) 

where  B(z)  is  any  finite  or  infinite  product  of  Blaschke  factors  and  a€]R. 

The  conditions  of  Theorem  A. 3  imply  that  a  solution  g(z)  has  a 
Hadamard  factorization 


g(z)  =  lg(0) [e 


i (a+6z) 


n  (i  -  —  ) 

1=1  '  I 


where  a,6€]R,  which  may  be  rewritten  as 


g(z)  = 


lg(0)  je 


i (a+Bz) 


n  (i  -  n  (i  -  — ) 

zeA  '  fil  (n-a  )  '  r 


n 

le  (n-A) 


=  g^{z)g2{z) 


(A. 7) 


(A. 8) 

where  A  is  a  subset  of  the  natural  numbers  N.  Theorem  A. 3  thus  states  that 
any  solution  h(z)  has  the  form 


h{z)  =  e^^  g^(z)g*{z*),  y 


(A. 9) 


i.e.  that  all  possible  solutions  are  in  one  to  one  correspondence  with  all 
possible  factorizations  of  g(z). 


A.  2  ExX-ZrUt-Lon  to  Two  VXmziuXoM 

Conditions  that  multiple  solutions  to  the  2-D  phase  retrieval  problem 
must  satisfy  can  be  deduced  from  the  l-D  results.  To  begin,  suppose  that 
the  problem  as  stated  has  a  solution  pair  g(z^,Z2),  where 

z  =  X +  iy  and  oo  =  u +  iv  denote  variables  in  the  transform  and  physical 
domains,  such  that  G(u^,U2)  has  only  a  finite  number  of  jump  discontinuities 
over  B.  Then 

Lemma  A.  2:  g(z^,Z2)  is  an  entire  function  of  the  complex  variables 


of  exponential  growth. 


Paoo^. 


After  defining  the  quantities 


=  max{u^:  (u^.u^)  C  b}  ,  =  max{u2  :  (u^  .u^)  €  b} 

=  min{u^  :  (u^  ,0^)  €  b}  ,  =  min{u2  :  (u^  jU^)  €  b} 


can  be  expressed  as 


-1  iz^u^  -2 

g(Zl.22^  =  I  dUi  e  I 


u^(u^) 


^^2'"2 

du2  e  G(u^,U2) 


U2(Ui) 


(A. 10) 


Writing  g(z  ,z  )  as  g  (z  )  to  indicate  that  ‘g(z  ,z  )  is  to  be  considered 
X  ^  Z  2  X  X  ^ 

as  a  function  of  z^  only  with  Z2  fixed  gives 


g  (z 
^2 


1)  =  J_  G(u^,Z2)e  du^ 


(A. 11) 


Therefore  by  Theorem  A.l  g  (z  )  is  an  entire  function  of  z.  of  exponential 

Z2  1  1 

growth.  A  similar  procedure  shows  that  g  (z_)  is  an  entire  function  of  z_. 

^1 

An  immediate  consequence  of  this  lemma  is  that  if  a  solution  exists  then 

2 

the  modulus  m(x^,X2)  has  an  analytic  extension  to  all  of  m  ,  which,  under 
the  assumption  that  m(x^,X2)  >0  V(x^,X2),  is  unique. 

Now  let  h(z^,Z2),  11(01^,0)2)  be  any  other  solution  pair  to  this  problem; 
then  h  (x, )  and  g  (x,)  must  have  the  same  modulus  m  (x,)  over  the 

X2  1  ^X2  1  X2  1 

set  B  =  {x, : (x, ,x_)  €  b} ;  i.e.  h  (z, )  and  g  (z, )  are  both  solutions 

X2  1  1  2  ^^2  ^  ^ 

to  ?.  l-D  phase  retrieval  problem.  Therefore  by  Theorem  A. 2,  Lenana  A.l  and 
the  above  assumptions  on  g,  G  and  m  we  have  that 


ia(x  )  i6{x  )z 

h  (z.)  =  e  e  (z,)g  (z, )  (A. 12) 

where  ai-x.^)  and  6(x2)  are  constants  dependent  on  X2  only  and  B(z)  is 
an  infinite  product  of  Blashke  factors  formed  from  the  zeros 
g  (z  ) .  Thus  if  Hn  (x_)  are  the  zeros  of  h  (z. )  and  the  sets  X  ,  Y 

X2  1  a  2  X2  1  ^2  ^2 

are  defined  by 


00 

X  =  U  {p.(x  ),x  }, 
^2  S.=l  ^ 


Y  =  u  {np(x5),x  } 

^^2  1=1  ^ 


it  follows  that 


(A. 13) 


(A. 14) 


Let  X  and  Y  be  the  sets  of  zeros  of  g  and  h  respectively.  It  is 
known  [221  that  the  zeros  of  a  function  of  n  complex  variables  form  an 
analytic  set  of  dimension  (n-1)  which  in  turn  is  the  union  of  analytic  mani¬ 
folds  of  dimension  (n-1)  and  a  set  of  singular  points  of  dimension  at  most 
(n-2)  .  Therefore  for  almost  all  X2  the  points  (Pj^(x2),X2)  and  (ri^(x2),X2) 
are  members  of  analytic  submanifolds  of  X  and  Y  respectively;  that  is  for 
each  k  there  exist  maps 


:C‘-^  Y 

W 

such  that  for  almost  all  x.,  41,  and 

2  k 


=  (Pj^(x2),X2) 
=  (nj^(x2),x2) 


(A. 15) 


are  analytic  in  a  neighborhood 


N  (x  )  of  x_.  Furthermore  for  almost  all  x, ,  the  maps 


are  analytic  in  neighborhoods  N  (x_)  of  x-  (note  the 
K  K  X  K  ^ 

dependence  of  N  (x  )  on  k) . 

We  now  suppose  that  the  zeros  well  separated,  i.e. 

^  ‘^£^^2^  Vk,£,  k?^£  (A. 16) 

it 

Since  by  (A.  12)  analyticity  of  and 

imply  that 

yj^(z)  =  Pj^(z)  or  P^(2*)  V2eNj^(x2)  .  (A, 17) 

Therefore  if  X^cx  and  are  the  analytic  extensions  of  the  neighbor¬ 

hoods  of  X  and  Y  to  analytic  manifolds  then 
'‘2  '‘2 

If  Y  7^ X  then  there  exists  a  point  y €  Y,  and  an  associated  neighborhood 
XX  •  X 

N(y)c:Y^  such  that  N(y)c:x*-X^.  By  analytic  continuation  N(y)  can  be 

if 

extended  to  an  analytic  manifold  such  that  Y^cx^^  and  Y^cy^.  if 

Y^  =  Yj^  then  Y^^  =  X*  otherwise  Y^^  must  be  decomposable  into  two  analytic 
submanifolds  Y2  and  Y^  such  that  ^2^^1  ^3*“^1  ^1" 

Thus,  apart  from  alternative  solutions  generated  by  varying  a(X2)  and 
6(x2) ,  a  necessary  condition  that  an  alternative  solution  h  to  the  2-D 
phase  problem  must  satisfy  is  that  some  of  its  zeros  be  the  complex  conjugates 
of  those  of  the  original  solution.  Although  this  is  the  same  mechanism  by 
which  an  infinite  number  of  alternative  solutions  to  the  1-D  phase  problem 
are  generated,  the  zeros  must  now  satisfy  the  condition  that  they  form  a 
union  of  one-dimensional  analytic  manifolds  as  opposed  to  a  union  of  zero 
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dimensional  manifolds,  that  is  a  collection  of  connected  analytic  line  seg¬ 
ments  as  opposed  to  a  collection  of  isolated  points.  If  an  alternative  solu¬ 
tion  exists  then  either  the  whole  manifold  X  has  been  "flipped"  to  its 
conjugate,  or  it  has  been  "torn"  and  only  partially  flipped.  The  connected 
nature  of  implies  that  the  existence  of  "dotted  lines"  along  which  tears 

may  be  made  is  very  unlikely;  this  compares  to  the  isolated  points  in  the  1-D 
problem,  each  of  which  may  be  flipped  independently  of  the  other. 

Given  this  condition  the  form  of  an  alternative  solution  may  be  determined. 
Let  Xj^  be  decomposable  into  submanifolds  X^,  and  define 

X,  i  X.  n  { (z, ,x- ) ;x_  fixed,  z,  GcH  .  (A. 19) 

i,x_  1  1  2  2  1 


then  the  function  g  (z^ )  may  be  written  as  the  product 

X2  1  ^ 


(A. 20) 


where 


92  X  ^ 

'2 


(A. 21) 


By  (A. 12)  h  (z  )  may  be  written  as 

X2  1 


h  (z  )  =  g  (z  )g*  (z*) 

x^  1  1,X2  1  2,X2  1 


(A. 22) 


If  h(z  ,z  )  exists  it  is  the  analytic  extension  of  h  (z  ) ,  therefore 

1  z  X2  1 

h(z^,Z2)  =g^(z^,Z2)g2(z^/Zp  . 

We  have  not  been  able  to  show  that  this  necessary  condition  for  alter¬ 


native  solutions  is  also  sufficient;  i.e.  given  submanifolds  X_,  X,  and 


the  decomposition  of  Eq.  (A. 20)  that  the  h  (z  )  of  Eq.  (A. 22)  may  be  ana- 

^2  ^ 

lytically  continued  to  a  function  hCz^^jZ^)  •  One  source  of  trouble  is  the 

00 

dependence  of  possible  that  for  every  ~ 

Then  although  each  zero  [p^{x^) ,x^)  is  analytic  in  a  neighborhood  of 
there  does  not  exist  a  neighborhood  over  which  all  zeros  are  uniformly  ana¬ 
lytic,  and  therefore  a  neighborhood  over  which  the  product  of  Eq.  (A. 21)  is 
provably  analytic. 

If  the  cardinalities  of  X.  are  finite,  e.g.  g(z  ,z  )  is  a  poly- 

1 , 1  Z 

nomial,.then  sufficiency  can  be  shown.  In  the  polynomial  case  decomposability 
of  into  X^  and  X^  is  equivalent  to  a  factorization  of  the  polynomial. 

However  almost  all  polynomials  of  two  variables,  are  irreducible  so  that  such 
a  factorization  and  decomposition  does  not  exist,  therefore  alternative 
solutions  do  not  exist.  Irreducibility  extends  to  general  functions  of  two 
variables  with  infinite  sets  of  zeros,  so  that  exact  alternative  solutions  are 
most  unlikely  in  2-D  phase  retrieval.  This  result  on  polynomials  and  its 
implications  is  also  presented  in  [23]. 

A.  3  Jhz  SuppoA^  0^  AtteAncutivz  SoZivtiom 

In  the  previous  subsection  conditions  that  alternative  solutions  g  and 
h  must  satisfy  in  order  that  |g(Xj^,X2)  |  =  jh(x^,X2)  |  were  derived.  It  re¬ 
mains  to  derive  necessary  conditions  on  g  and  h  so  that  (support  G) 

=  (support  H) .  The  first  is  that  6(X2)  =0  in  Eq.  (A.12) .  This  follows  by 
noting  that  if  G(Uj^,U2)  is  nonzero  in  neighborhoods  of  points  (u^jU^), 
(Uj^jU^)  6B  then  the  function  G(Uj^,X2)  of  Eq.  (A. 11)  will  be  nonzero  in 
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So  by  Theorem  A. 3 


neighborhoods  of  ~  ~  almost  all  x^. 

h  (z  )  is  the  transform  of  a  function  H(u  ,x_)  with  support  in  (u  ,u^) 
^2  -1-  1  11 

if  and  only  if  6(x2)  SO. 

A  second  condition  follows  from  noting  that  the  boundedness  of  the  set 
B  implies  that  h(z^,Z2)  is  of  exponential  growth  in  z^,  so  that  cx(x^) 
must  only  be  a  linear  function  of  x^.  Summarizing  these  results  and  those 
of  the  previous  section  gives  the  next  theorem. 


The.OA.ejn  A.  4:  Let  g,  G  be  a  solution  pair  to  the  2-D  phase  retrieval  problem. 
Then  any  other  solution  pair  h,  H  must  have  the  form 


i  (01+0222)  ^  ^ 

h(2i,Z2)  =  e  -  gi(Zi,Z2)g2(z3^»Z2^ 


(A. 23) 


where  9j^92  *  factorization  of  g. 

We  have  been  unable  to  complement  these  necessary  conditions  for  equality 
of  support  with  sufficient  conditions  equivalent  to  those  for  the  1-D  problem. 
The  difficulty  seems  to  lie  in  determining  the  role  of  the  geometry  of  B;  we 
give  two  examples 

1.  The  first  example  concerns  convexity  and  is  ta)cen  from  Huiser  and 
Torn  [24].  Let  g,  G  be  a  solution  pair,  then  after  the  change  of  variables 
to  the  new  orthogonal  coordinate  systems  (Si,S2) ,  (ti,t2)  with 


51  =  Ui  cos  +  U2  sin  ip  ^1  ~  ^1  ^ 

52  =  -Ui  sin  ip  +  U2  cos  ^  t2  sin  +  cos  ip 


(A. 24) 


and  definition  of  the  quantities 


=  maxis^ : (s^/S^)  €  b) 


32(3^)  =  min{s2: (s^.s^)  C  b} 


=  max{s^  :  (3^,82)  €  b} 

3j^(i|;)  =  min{3^:  (3^,82)  €  b} 
the  relationship  g=  may  be  rewritten  as 


(A. 25 


g(t^,t2) 


st(s, ) 


..  j;  ^ 


ds^  e 


is_t_ 
2  2 


G(s^,S2)  .(A. 26 


82(3^) 


For  fixed  t  the  growth  rate  in  g  (t  )  is  determined  by  etci//)  and 

^2 

Sj^(i|;r>  Knowing  these  values  for  all  4^  i3  equivalent  to  knowing  all 
supporting  hyperplanes  for  the  set  B,  which  by  duality  arguments  from  linear 
algebra  is  equivalent  to  knowing  the  convex  hull  of  B.  If  h  and  H  is 
any  other  solution  pair  then  h  (t^)  must  have  the  same  growth  as  g  (t  ) 
otherwise  H  has  support  outside  of  the  convex  hull  of  B. 

If  g  has  a  factorization  92^92  that  the  growth  of  g2  is  always 

dominated  by  that  of  g^  (e.g.  g2  is  a  polynomial)  then  the  alternative 
function 

*  *  * 

h{z^,z^)  =  g^(z^,Z2)g2(Zj^»Z2^  (A. 27 

has  the  same  modulus  as  g  and  support  in  the  convex  hull  of  B.  If  B  is 
convex  then  h  is  an  alternative  solution,  if  B  is  not  convex  then  it  is 
possible  that  the  support  of  H  is  not  B  even  though  still  in  the  convex 


2.  Let  g,  G  be  a  solution  pair,  then  it  is  trivial  to  show  that  the 

*  *  *  * 

inverse  transform  of  g  is  G  (-00^, -to^)  which  has  support  -B.  So 

*  *  * 

a  sufficient  condition  that  g  be  an  alternative  solution  is  that 

B=-B,  i.e.  B  is  invariant  under  rotation  by  180°. 

Example  1  suggests  that  convexity  of  B  is  necessary  for  existence  of  an 
alternative  solution  and  taken  with  example  2  suggests  that  for  a  factoriza¬ 
tion  g  into  9j^92  alternative  solution  h  of  Eq.  (A. 27)  then  B 

must  have  symmetries  linked  in  some  fashion  to  those  directions  in  which 
growth  of  dominates  g^. 

A.  4  Conclu^iovib 

Nonuniqueness  in  the  phase  retrieval  problem  in  two  dimensions  appears  to 
depend  on  two  conditions:  (1)  that  the  zero  space  of  g  be  decomposable  into 
a  union  of  several  submanifolds,  (2)  that  B  possesses  a  suitable  combination 
of  convexity  and  symmetry.  Both  conditions  will,  in  general,  be  difficult  to 
satisfy  compared  to  the  1-D  phase  retrieval  problem.  Only  in  the  case  of 
symmetries  that  effectively  reduce  giz^tZ^)  to  a  function  of  one  variable 
(e.g.,  the  possession  of  radial  symmetry  investigated  in  [25])  will  the  mani¬ 
fold  have  an  infinite  decomposition  as  appears  in  the  1-D  problem.  In  most 
cases  it  will  be  indivisible.  Likewise  the  general  two-dimensional  bounded 
set  has  considerably  more  degrees  of  freedom  than  the  one-dimensional  bounded 
set,  the  interval,  consequently  it  has  far  fewer  symmetries.  Therefore  in 
general  the  2-D  phase  retrieval  problem  will  have  a  unique  solution  if  one 


exists . 
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Tcible  6.  The  range  of  nonzero  eigenvalues  in  algorithms  FN  and  FNP.  Note 
that  for  0  =  1.5,  2  the  finite  Fourier  transform  has  at  most  15 
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30 
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26-43 

27-40 

28-39 

28-3 

G^  FNP 

17 

14-24 

16-23 
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- 
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FIGURE  LEGENDS 


Fig.  1.  Two  examples  of  poor  convergence  of  the  alternating  projection 
algorithm  induced  by  ill  conditioning. 

Fig.  2.  The  tangent  plane  T^(x)  to  a  set  at  the  point  P^(x).  T^ (x) 

locally  separates  and  (x) . 

Fig.  3.  An  example  of  the  form  of  the  Hessian  operator  (x)  associated 
with  a  sphere  B^. 

Fig.  4.  A  solution  to  model  problem  I,  test  funption  G  with  F{G)  =8*10  : 

A  reconstructed  modulus,  o  reconstructed  phase,  -  exact  modulus, 

-  exact  phase. 

x2  ^  -S 

Fig.  5.  A  solution  to  model  problem  I,  test  function  G  with  F(G)  =7*10  : 

A  reconstructed  modulus,  o  reconstructed  phase,  -  exact  modulus, 

-  exact  phase. 

^2  ^  ‘*4 

Fig.  6.  A  solution  to  model  problem  I,  test  function  G  with  F(G)  =8’10  : 

A  reconstructed  modulus,  o  reconstructed  phase,  -  exact  modulus, 

-  exact  phase. 

x2  X  - 

Fig.  7.  A  solution  to  model  problem  III,  test  function  G  with  F{G)  =2-10 

A  reconstructed  modulus,  o  reconstructed  phase,  -  exact  modulus, 

-  exact  phase. 

Fig.  8.  A  solution  to  model  problem  III,  test  function  G  with  F(G)  =3*10 

A  reconstructed  modulus,  o  reconstructed  phase,  -  exact  modulus, 

-  exact  phase. 
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Fig.  9. 


A  solution  to  model  problem  III ,  test  function  G  with 
%  —2 

F(G)  =7-10  :  A  reconstructed  modulus,  o  reconstructed  phase, 

-  exact  modulus,  -  exact  phase. 

Fig.  10.  An  example  of  preservation  of  symmetry,  a  solution  to  model  problem 
I,  test  function  G  with  G^ = 0. 

Fig.  11.  Sample  realization  solution  to  model  problem  I,  test  function  G^, 

e  =  20%:  A  reconstructed  real  component,  o  reconstructed  imaginary 

component,  - exact  real  component,  -  exact  imaginary  component. 

Fig.  12.  Sample  realization  solution  ^o  model  problem  1,  test  function  G^, 

e=60%:  A  reconstructed  real  component,  O  reconstructed  imaginary 

component,  - exact  real  component,  -  exact  imaginary  component. 

~1 

Fig,  13,  Sample  realization  solution  to  model  problem  1,  test  function  G  , 

e  =  100%:  A  reconstructed  real  component,  o  reconstructed  imaginary 

component,  -  exact  real  component,  -  exact  imaginary  component, 

^1 

Fig.  14.  Sample  realization  solution  to  model  problem  III,  test  function  G  , 

e  =  20%;  A  reconstructed  real  con^sonent,  O  reconstructed  imaginary 

component,  -  exact  real  component,  -  exact  imaginary  component. 

^1 

Fig.  15,  Sample  realization  solution  to  model  problem  III,  test  function  G  , 

e  =  60% ;  A  reconstructed  real  component ,  O  reconstructed  imaginary 

component,  -  exact  real  component,  -  exact  imaginary  component. 

^1 

Fig.  16.  Sample  realization  solution  to  model  problem  III,  test  function  G  , 

£  =  100%;  A  reconstructed  real  consonant,  O  reconstructed  imaginary 
component,  -  exact  real  component,  -  exact  imaginary  component. 
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Fig.  17.  Sample  realization  solution  to  model  problem  I,  test  function 
£=20%;  O  reconstructed  phase,  A  reconstructed  modulus, 

-  exact  modulus,  -  exact  phase. 

Fig.  18.  Sample  realization  solution  to  model  problem  1,  test  function 
£  =  60%;  O  reconstructed  phase,  A  reconstructed  modulus, 

-  exact  modulus,  -  exact  phase. 

Fig.  19.  Sample  realization  solution  to  model  problem  I,  test  function  g" , 

£  =  100%;  o  reconstructed  phase,  A  reconstructed  modulus, 

-  exact  modulus,  -  exact  phase. 

~2 

Fig.  20.  Sample  realization  solution  to  model  problem  III,  test  function  G 
£  =  20%;  o  reconstructed  phase,  A  reconstructed  modulus, 

- exact  modulus,  -  exact  phase. 

y.2 

Fig.  21.  Sample  realization  solution  to  model  problem  III,  test  function  G 
e  =  60%;  o  reconstructed  phase,  A  reconstructed  modulus, 

-  exact  modulus,  -  exact  phase. 

Fig.  22.  Sample  realization  solution  to  model  problem  III,  test  function  G 
£  =  100%:  o  reconstructed  phase,  A  reconstructed  modulus, 

-  exact  modulus,  -  exact  phase. 
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TWO  examples  of  poor  convergence  of  the  alternating  projection 
algorithm  induced  by  ill  conditioning. 


/s2  ^ 

A  solution  to  model  problem  I,  test  function  G  with  F(6)  *8' 


10 


A  reconstructed  modulus,  o  reconstructed  phase,  -  exact  modulus 


exact  phase. 


A  solution  to  model  problem  I,  test  function  G  with  F(G)  =7*10 
A  reconstructed  modulus,  O  reconstructed  phase,  — —  exact  modulus 


exact  phase. 


MODULUS  PHASt 


2 


Fig.  7. 


*  ( 


o  o 


A  solution  to  model  problem  III,  test  function  G  with  F(G)  >2-10 
t  reconstructed  modulus,  o  reconstructed  phase,  -  exact  modulus 


exact  phase. 


exact  phase. 


o>» 


14.  Seunple  realization  solution  to  model  problem  III,  test  function 

e«20%:  A  reconstructed  real  component,  O  reconstructed  imaginary 
component,  - exact  real  component,  — —  exact  imaginary  component 


c»» 


Fig.  15.  Sample  realization  solution  to  model  problem  III,  test  function 

e«60%;  A  reconstructed  real  consonant .  o  reconstructed  imaginan' 
component,  - exact  real  exponent, - exact  imaginary  co’-^-vnent 


o>* 


r*«lx*ation  solutim  to  model  problem  Ill,  test  function  G  , 
c  •  100%;  L  reconstructed  real  conponent,  O  reconstructed  imaginary 
ceeiponent,  -  exact  real  coaiponent,  — —  exact  imaginary  component 


MODULUS  PHASE 


Sait^le  realization  solution  to  model  problem  I ,  test  function  G 
c«100%:  o  reconstructed  phase,  A  reconstructed  modulus, 

- exact  modulus,  -  exact  phase. 


MODULUS  -  PHASE 


Fig.  20.  Sairple  realization  solution  to  model  problem  III,  test  function 
C>20%:  O  reconstructed  phase,  A  reconstructed  modulus, 


— -  exact  modulus , 


exact  phase. 


no 


MODULUS  -  PHASE 


Fig.  21.  Sample  realization  solution  to  model  problem  111,  test  function  G 
£*60%:  o  reconstructed  phase,  A  reconstructed  modulus. 


—  exact  modulus 


exact  phase. 


Modulus  -  phase 


Fig.  22.  Sanple  realization  solution  to  model  problem  III,  test  function  G 
e«100%;  O  reconstructed  phase,  L  reconstructed  modulus, 

-  exact  modulus,  -  exact  phase. 
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1400  Wilson  Blvd. 

Arlington  VA  22209 


ERIM  1 

Attn:  Or  S  Robinson 

PQOox  8618 

Ann  Arbor  MI  48107 


MRJ  Inc.  1 

Attn:  Dr  K  Robinson 
71  Blake  St. 

Meedham  MA  02192 


ttek  Corp.  1 

Attn:  E.  Oalat 
Optical  Systems  Division 
10  Maguire  Rd. 

Lexington  MA  02173 

DARPA/STO  1 

Attn:  Lt.  Ccl.  Bell 
1400  Wilson  Blvd. 

Arlington  VA  22209 


DL-3 


ArWL/ARAA 

Attn;  Dr  L  Sfeolnik 

i'-irtland  AFQ  871 17 


Aerospace 

Attn;  Dr  V  Maha jan 
"I'S  ri4/978 

i350  E.  El  Sagundo  Blvd. 
£1  Sagundo  CA  90245 

l.ocKhaed 

Attn;  Dr  S  Wlllianis 
3251  Hanover  St. 

Palo  Alto  CA  94304 


Hughes 

Attn;  Dr  R  Withrington 
Build.  El  MS  F/124 
El  Sagundo  CA  90245 


tOSC 

Attn;  Dr  D  Fried 
PQBo*  446 

Placentia  CA  92670 


3DM  Carp. 

Attn;  Dr  E  Silvertooth 

C/0  Jeanette  Miller  (Security  Officer) 

5155  W.  Rosecrans  Ave  Hauthorne  CA  90250 


Riverside  Research 

Attn;  HALO  Library*  Mr  R  Passett 

1701  N  Fort  Meyer  Dr 

r^r  ling  ton  VA  22209 


D ARP A/ DEO 
Attn;  R  Strunce 
1^00  Milson  Blvd. 
Arlington  VA  22209 
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RGB  Assoc. 

Attn:  Prof  R  Barakac 
FOE ox  8 

way  land  MA  01779 


Eikonix  Coro. 

Attn:  Dr  R  Gonsalves 

23  Crosoy  Rd. 

Bedford  MA  01730 


NY  IT 

Institute  for  Optics 

Prof  ri  Halioua 

Old  Westbury  NY  11568 


DL-5 


■  ^  ' 


I*  f- 


I 


MISSION 

of 

R(me  Air  Development  Center 

MOC  ptam  and  execMX&i  4e6aa'ic/t,  deveZopmznt,  tut  and 
■6eZexUed  acqul&ttcon  pfLogfiam  In  6uppo>it  Command,  ContAot 
Coimunicatcom  and  InteZtigence  (C^I)  actioitiu.  Technical 
and  engtnee/Ung  AappoAX  wJOnJin  okcaa  oi  technical  competence 
i6  provided  to  ESP  Paogaam  OHiceA  (POa)  and  othea  ESP 
elements.  The  principal  technical  ml66lon  oAeoA  ate 
communications,  electtomagnetic  guidance  and  control,  sua- 
velilance  o^  gAound  and  aetospace  objects,  inteUUgence  data 
coltection  and  handling,  In^oAmation  system  technology, 
lonospheAlc  pAopagatlon,  solid  state  sciences,  micAowave 
physics  and  elec^onic  Aeliablllty,  maintainability  and 
compatibility. 


